Who?
- Benoit Xhenseval
- Tech Architect in the City for... years
- Built and sold a dot-com
- CTO of startup developing a Securities Financing Platform
- Looking for a bit of fun!
- @benoitx
One for the 10 famous Belgians!
Idea?
What does it do?
- Find Tweets with a Spotify Item (track, album, artist or playlist)
- Supports Spotify URIs, official URLs and short URLs too
- Count the Tweets
- Create a Global music chart!
- Simples...
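A minimal sketch of the "find a Spotify item in a tweet" step, written for a current JDK rather than the JDK 1.6 of the stack. The class name `SpotifyLinkParser`, the regular expression and the "type:id" key format are illustrative assumptions, not the site's actual code. Short URLs are not handled here; they would first have to be expanded by following the redirect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds Spotify items mentioned in a tweet's text. */
final class SpotifyLinkParser {

    // Matches the URI form (spotify:track:ID) or the open.spotify.com URL form.
    // Playlists may carry a user segment, e.g. spotify:user:NAME:playlist:ID.
    private static final Pattern ITEM = Pattern.compile(
        "(?:spotify:|https?://open\\.spotify\\.com/)"
        + "(?:user[:/][^:/\\s]+[:/])?"
        + "(track|album|artist|playlist)[:/]([A-Za-z0-9]+)");

    private SpotifyLinkParser() { }

    /** Returns "type:id" keys (an assumed format) in order of appearance. */
    static List<String> extract(String tweetText) {
        List<String> items = new ArrayList<>();
        Matcher m = ITEM.matcher(tweetText);
        while (m.find()) {
            items.add(m.group(1) + ":" + m.group(2));
        }
        return items;
    }
}
```

Calling `SpotifyLinkParser.extract(text)` on each tweet gives the keys that can then be counted for the chart.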
Architecture
Enterprise Java Stack
- JDK 1.6
- Spring 3 and Spring MVC 3
- Quartz for scheduling
- Maven 3 for build (jetty:run +++)
- Hibernate
- Transactional & Multi-threaded
- Web site does not touch DB
- Using Google Analytics and AdSense
Deployment
- 7-year-old PC, 1 GB RAM, 320 GB disk
- Ubuntu 9.04
- Tomcat 6, single WAR for now
- MySQL 5.1
Twitter Search
- Using Twitter4j 2.0.10
- Search for 'spotify' every n secs.
- Parse URI or short URL
- Persist the Tweet and user info
- Fetch user details (name/followers)
- Deal with limits, broken profile images
- Hand over to another thread for the Spotify fetch (in memory, but transactionally)
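The poll-then-hand-over flow can be sketched roughly as follows. `TweetSearch` is a hypothetical stand-in for the Twitter4j search call, and `Tweet` and `SearchPoller` are illustrative names. The transactional side of the hand-over is not shown.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;

/** Minimal tweet holder (hypothetical; the real app persisted more fields). */
final class Tweet {
    final long id;
    final String text;

    Tweet(long id, String text) {
        this.id = id;
        this.text = text;
    }
}

/** Hypothetical stand-in for the Twitter search call. */
interface TweetSearch {
    /** Returns tweets whose id is greater than sinceId. */
    List<Tweet> searchSince(String query, long sinceId);
}

/** One polling step: fetch new tweets and hand them to another thread via a queue. */
final class SearchPoller implements Runnable {
    private final TweetSearch search;
    private final BlockingQueue<Tweet> handOver;
    private long lastSeenId = 0L;

    SearchPoller(TweetSearch search, BlockingQueue<Tweet> handOver) {
        this.search = search;
        this.handOver = handOver;
    }

    @Override
    public void run() {
        try {
            for (Tweet t : search.searchSince("spotify", lastSeenId)) {
                // Advance the high-water mark only for tweets actually queued.
                if (handOver.offer(t)) {
                    lastSeenId = Math.max(lastSeenId, t.id);
                }
            }
        } catch (RuntimeException e) {
            // A failed poll must not kill the schedule; the next run retries
            // from the same lastSeenId.
        }
    }
}
```

To run it every n seconds, the poller can be passed to `Executors.newSingleThreadScheduledExecutor().scheduleWithFixedDelay(poller, 0, n, TimeUnit.SECONDS)`, with a worker thread calling `take()` on the same queue (for example a `LinkedBlockingQueue`) to do the Spotify fetch.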
Reply to users only 1st time:
- Avoid spamming
- Use Spring msg to support i18n
- Twitter determines the tweet language... sometimes wrongly!
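A rough sketch of the "reply only the first time" rule with a per-language message. The real app used Spring messages; here a plain map of `MessageFormat` templates stands in for them. The class name, the message texts and the in-memory set are all assumptions (the real app would need to remember users across restarts).

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: builds a reply for a user only on first sight. */
final class FirstTimeReplier {
    // In-memory stand-in for whatever store remembers greeted users.
    private final Set<String> alreadyReplied = ConcurrentHashMap.newKeySet();
    // Stand-in for Spring's message bundles; the texts are invented examples.
    private final Map<String, String> templates = new HashMap<>();

    FirstTimeReplier() {
        templates.put("en", "@{0} thanks, your Spotify link now counts towards the chart!");
        templates.put("fr", "@{0} merci, votre lien Spotify compte maintenant pour le classement !");
    }

    /** Returns the reply text, or null when this user was already greeted. */
    String replyFor(String screenName, String languageCode) {
        // Screen names are compared case-insensitively to avoid double replies.
        if (!alreadyReplied.add(screenName.toLowerCase(Locale.ROOT))) {
            return null;
        }
        String template = templates.get(languageCode);
        if (template == null) {
            // Unknown or wrongly detected language: fall back to English.
            template = templates.get("en");
        }
        return MessageFormat.format(template, screenName);
    }
}
```

Falling back to English is one way to cope with Twitter's language guess being wrong or unsupported.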
Spotify
- Eliminate 'duplicate' tweets postings
- From URL contact Spotify and get information:
- Track / Album / Artist / Playlist
- Store in local DB
- Deal with connectivity/timeout issues (transactional)
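Dealing with timeouts can be sketched as a small retry wrapper around the lookup. `Fetcher` and `Retry` are hypothetical names, and the attempt count and pause are parameters chosen by the caller. The actual Spotify call and the transaction handling are not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical abstraction over "contact Spotify and get information". */
interface Fetcher<T> {
    T fetch() throws IOException;
}

final class Retry {
    private Retry() { }

    /**
     * Calls the fetcher up to maxAttempts times, pausing between attempts.
     * Timeouts such as SocketTimeoutException are IOExceptions, so they are retried.
     */
    static <T> T withRetries(Fetcher<T> fetcher, int maxAttempts, long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetcher.fetch();
            } catch (IOException e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new UncheckedIOException("Spotify lookup failed", last);
    }
}
```

If every attempt fails, the caller can roll back and leave the item marked as missing so that it is requested again later.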
Scheduled Stats
Every 30 min, refresh stats:
- Layer of DAO to collect data (Hibernate)
- Result objects stored in Distributed EHCache
- Web site will use cache only and be stateless
- Stats could be refreshed more frequently
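The "refresh on a schedule, serve from cache only" idea can be sketched as below. An in-process `AtomicReference` stands in for the distributed EHCache, and `StatsCache` plus the `Supplier` loader (standing in for the Hibernate DAO layer) are illustrative assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical snapshot cache: the scheduler writes, the web site only reads. */
final class StatsCache {
    private final AtomicReference<Map<String, Integer>> snapshot =
        new AtomicReference<>(Collections.<String, Integer>emptyMap());
    private final Supplier<Map<String, Integer>> loader;

    StatsCache(Supplier<Map<String, Integer>> loader) {
        this.loader = loader;
    }

    /** Called by the scheduler: rebuilds the result and swaps it in atomically. */
    void refresh() {
        Map<String, Integer> fresh = new LinkedHashMap<>(loader.get());
        snapshot.set(Collections.unmodifiableMap(fresh));
    }

    /** Called by web requests: never touches the loader, so never the database. */
    Map<String, Integer> current() {
        return snapshot.get();
    }
}
```

With Quartz, a cron expression such as `0 0/30 * * * ?` would trigger `refresh()` every 30 minutes. Readers always see a complete snapshot, either the old one or the new one.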
Follow Schedule for Sending Tweets
- Quartz (cron); send tweet with stats
- Bitlyj to shorten URLs, using j.mp saving a FULL 2 characters... (back to the 70s?)
- Using Geo location API for sending tweets... just because we can
Web Site
- Spring MVC 3 RC1
- All requests mapped via @Controller and @RequestMapping
- Web site only uses EHCache for data
- Rendering via JSP and JSON for charts (OpenFlashChart)
- Average response time < 10 ms
- Heatmap applet from JTreeMap
Feeds & Playlists
Atom Feed with Top 10 tracks
- Using Rome, result in cache
- URI for each entry is key for readers
Spotify Playlists
- Top 20 for Day, Week and Month
- Both changing and fixed
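A Top-N list comes down to counting tweets per item and ranking the counts. The sketch below does this in memory; how the real queries were written is not shown in the slides. `ChartBuilder`, its input format and the alphabetical tie-break are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical chart builder: ranks items by how many tweets mention them. */
final class ChartBuilder {
    private ChartBuilder() { }

    /** Takes one item key per tweet and returns the n most tweeted keys. */
    static List<String> topN(List<String> itemPerTweet, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : itemPerTweet) {
            counts.merge(item, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties broken alphabetically for a stable chart.
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(n, entries.size()))) {
            top.add(e.getKey());
        }
        return top;
    }
}
```

Running it over the tweets of a day, a week or a month with `n = 20` gives the three playlists.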
What I learned
- Quick Development (3 days), yet robust and scalable tech
Twitter:
- Twitter API quite robust (more so than the site). Need to deal with:
- Risk of being flagged as a spammer! Solution: be very open about any unsolicited tweet
- Slowing down the search requests
- Duplicate Tweets appearing in search results (diff IDs but 1 will be invalid)
- Invalid users (e.g. spammers removed)
- Dealing with limits (150 requests per hour, but the app was easily whitelisted and its limit raised. Thank you Twitter!)
- Broken profile images fixed thanks to Tweetimag.es. They do the hard work!
- Twitter4j does not disambiguate User Id vs Screen Name... yet!
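Slowing down requests to stay under an hourly limit is mostly arithmetic: 150 per hour means one request every 3600 / 150 = 24 seconds. A minimal sketch, with `RequestPacer` as a hypothetical name; the caller supplies the clock value and does the actual waiting.

```java
/** Hypothetical pacer: spaces requests evenly to respect an hourly limit. */
final class RequestPacer {
    private final long minIntervalMillis;
    private long nextAllowedAt = 0L;

    RequestPacer(int requestsPerHour) {
        if (requestsPerHour <= 0) {
            throw new IllegalArgumentException("requestsPerHour must be positive");
        }
        // 3,600,000 ms per hour divided by the allowance, e.g. 24,000 ms for 150.
        this.minIntervalMillis = 3_600_000L / requestsPerHour;
    }

    /**
     * Reserves the next request slot and returns how many milliseconds the
     * caller should wait before sending, given the current time.
     */
    synchronized long reserve(long nowMillis) {
        long wait = Math.max(0L, nextAllowedAt - nowMillis);
        nextAllowedAt = Math.max(nowMillis, nextAllowedAt) + minIntervalMillis;
        return wait;
    }
}
```

A caller would do `Thread.sleep(pacer.reserve(System.currentTimeMillis()))` before each API call. Once the app is whitelisted, only the constructor argument changes.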
Spotify:
- A bit of an unknown
- Having to deal with incomplete replies, timeouts, etc.
- Solution: Transactional and able to re-request missing data
- Not using the new official Metadata API yet... but playlists are missing from the new APIs!
- Read the fine print to avoid choosing a domain that violates copyright...
Bitlyj:
- Excellent, fast, supports bit.ly and j.mp
- Twitter does not return geo-location with a search
- Will geo location get a proper format in user profile (at the moment 'anything goes')?
Stats
In 2 months:
- ~1,000 followers @spotichart
- ~200,000 Tweets with 'spotify'
- ~50,000 Tweets with a link
- ~18,000 Tweople
- ~25,000 Tracks
- ~20,000 Albums
- ~12,000 Artists
- ~5,000 Playlists
- AdSense Revenue: enough for a coffee
- (not a venti)
Thanks!
@benoitx
SpotiChart.com
SpotiChart.com - A Developer's perspective
Unanswered Questions