Who?

  • Benoit Xhenseval
  • Tech Architect in the City for... years
  • Built and sold a dot-com
  • CTO of a startup developing a Securities Financing Platform
  • Looking for a bit of fun!
  • @benoitx

One for the 10 famous Belgians!

Idea?

What does it do?

  • Find Tweets with a Spotify Item (track, album, artist or playlist)
  • Supports URI, official and short URL too
  • Count the Tweets
  • Create a Global music chart!
  • Simples...
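
The counting step above can be sketched in a few lines of plain Java. This is only an illustration, not the SpotiChart code: `ChartBuilder` and `topN` are hypothetical names, each tweet is assumed to have already been reduced to one Spotify URI, and it is written for a current JDK rather than the JDK 1.6 mentioned later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: ranks Spotify items by how many tweets mention them. */
final class ChartBuilder {

    /** Returns the n most-tweeted items, highest count first; ties are ordered by URI. */
    static List<Map.Entry<String, Integer>> topN(List<String> itemUrisFromTweets, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String uri : itemUrisFromTweets) {
            counts.merge(uri, 1, Integer::sum); // one tweet = one vote
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // descending count
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }
}
```

Calling `ChartBuilder.topN(uris, 10)` on the URIs collected for a period would give a top 10 for that period.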

Architecture

Enterprise Java Stack

  • JDK 1.6
  • Spring 3 and Spring MVC 3
  • Quartz for scheduling
  • Maven 3 for build (jetty:run +++)
  • Hibernate
  • Transactional & Multi-threaded
  • Web site does not touch DB
  • Using Google Analytics and AdSense

Deployment

  • 7-year-old PC, 1 GB RAM, 320 GB disk
  • Ubuntu 9.04
  • Tomcat 6, single WAR for now
  • MySQL 5.1

Twitter Search

  • Using Twitter4j 2.0.10
  • Search for 'spotify' every n secs.
  • Parse URI or short URL
  • Persist the Tweet and user info
  • Fetch user details (name/followers)
  • Deal with limits, broken profile images
  • Hand over to another thread for the Spotify fetch (in memory, but transactionally)
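
The "parse URI or short URL" step might look roughly like the sketch below. It assumes the link formats in use at the time (`spotify:track:<id>`, `http://open.spotify.com/track/<id>`, and the `user/<name>/playlist/<id>` form for playlists); `SpotifyLinkParser` is a hypothetical name. Short URLs would first need an HTTP request to follow the redirect, which is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser: finds a Spotify link in a tweet and returns it as a canonical URI. */
final class SpotifyLinkParser {

    // spotify:track:<id> or http://open.spotify.com/track/<id> (also album, artist)
    private static final Pattern ITEM = Pattern.compile(
            "(?:spotify:|open\\.spotify\\.com/)(track|album|artist)[:/]([A-Za-z0-9]+)");

    // spotify:user:<name>:playlist:<id> or http://open.spotify.com/user/<name>/playlist/<id>
    private static final Pattern PLAYLIST = Pattern.compile(
            "(?:spotify:|open\\.spotify\\.com/)user[:/]([^:/\\s]+)[:/]playlist[:/]([A-Za-z0-9]+)");

    /** Returns e.g. "spotify:track:abc123", or null when the text has no recognisable link. */
    static String toCanonicalUri(String tweetText) {
        Matcher playlist = PLAYLIST.matcher(tweetText);
        if (playlist.find()) {
            return "spotify:user:" + playlist.group(1) + ":playlist:" + playlist.group(2);
        }
        Matcher item = ITEM.matcher(tweetText);
        if (item.find()) {
            return "spotify:" + item.group(1) + ":" + item.group(2);
        }
        return null; // a tweet that merely mentions the word 'spotify'
    }
}
```

Normalising both forms to one URI means a track shared as a URI and the same track shared as a URL count towards the same chart entry.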

Reply to users the first time only:

  • Avoid spamming
  • Use Spring msg to support i18n
  • Twitter determines the tweet language... sometimes wrongly!
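
A minimal sketch of the reply-once rule with a per-language message. The talk uses Spring messages for i18n; here a plain `Map` stands in for them, the set of greeted users is kept in memory rather than in the database, and the class name and message wording are placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical policy: reply to a user only the first time they are seen. */
final class FirstReplyPolicy {

    private final Set<Long> alreadyGreeted = ConcurrentHashMap.newKeySet();
    private final Map<String, String> templates = new HashMap<>();

    FirstReplyPolicy() {
        // Placeholder wording, keyed by the language code reported for the tweet.
        templates.put("en", "@%s Thanks! Your Spotify link now counts towards the chart.");
        templates.put("fr", "@%s Merci ! Votre lien Spotify compte maintenant dans le classement.");
    }

    /** Returns the reply for a user's first tweet, or null if they were already greeted. */
    String replyFor(long userId, String screenName, String languageCode) {
        if (!alreadyGreeted.add(userId)) {
            return null; // add() returns false when the id was already present
        }
        String template = templates.get(languageCode);
        if (template == null) {
            template = templates.get("en"); // unknown or wrongly detected language
        }
        return String.format(template, screenName);
    }
}
```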

Spotify

  • Eliminate 'duplicate' tweet postings
  • From the URL, contact Spotify and get the item's information:
  • Track / Album / Artist / Playlist
  • Store in local DB
  • Deal with connectivity/timeout issues (transactional)
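
One simple way to cope with timeouts is to retry the request a bounded number of times. The sketch below shows that idea only; it is not the transactional approach of the talk, and `Retry.withRetries` is a hypothetical helper.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries a remote call that fails with an I/O error. */
final class Retry {

    /** Calls request up to maxAttempts times; rethrows the last IOException if all fail. */
    static <T> T withRetries(Callable<T> request, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // includes java.net.SocketTimeoutException; try again
            }
        }
        throw last; // non-null here: every attempt ended in an IOException
    }
}
```

Exceptions other than `IOException` are not retried and propagate to the caller immediately.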

Scheduled Stats

Every 30 min, refresh stats:

  • Layer of DAO to collect data (Hibernate)
  • Result objects stored in Distributed EHCache
  • Web site will use cache only and be stateless
  • Stats could be refreshed more frequently
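
The refresh-then-serve-from-cache pattern can be illustrated with the JDK alone. In this sketch an `AtomicReference` stands in for EHCache and a `ScheduledExecutorService` for Quartz; `StatsCache` and its methods are hypothetical names, and the loader represents the Hibernate DAO layer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical cache: a background job refreshes it, the web tier only reads it. */
final class StatsCache {

    private final AtomicReference<List<String>> topTracks =
            new AtomicReference<>(Collections.<String>emptyList());
    private final Supplier<List<String>> loader; // stands in for the DAO query

    StatsCache(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    /** Runs the (slow) query and swaps in the new snapshot atomically. */
    void refresh() {
        topTracks.set(Collections.unmodifiableList(new ArrayList<>(loader.get())));
    }

    /** What the web tier calls: it never touches the database. */
    List<String> topTracks() {
        return topTracks.get();
    }

    /** Schedules refresh() at a fixed rate; the caller should shut the scheduler down. */
    ScheduledExecutorService scheduleEvery(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep serving the previous snapshot; an uncaught exception
                // would cancel all further runs of this task.
            }
        }, 0, period, unit);
        return scheduler;
    }
}
```

With this shape, `cache.scheduleEvery(30, TimeUnit.MINUTES)` gives the 30-minute refresh, and a shorter period only costs more database queries, not slower page loads.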

Follow Schedule for Sending Tweets

  • Quartz (cron); send tweet with stats
  • Bitlyj to shorten URLs, using j.mp saving a FULL 2 characters... (back to the 70s?)
  • Using Geo location API for sending tweets... just because we can

Web Site

  • Spring MVC 3 RC1
  • All requests mapped via @Controller and @RequestMapping
  • Web site only uses EHCache for data
  • Rendering via JSP and JSON for charts (OpenFlashChart)
  • Average response time < 10 ms
  • Heatmap applet from JTreeMap

Feeds & Playlists

Atom Feed with Top 10 tracks

  • Using Rome, result in cache
  • URI for each entry is key for readers

Spotify Playlists

  • Top 20 for Day, Week and Month
  • Both changing and fixed

What I learned

  • Quick Development (3 days), yet robust and scalable tech

Twitter:

  • Twitter API quite robust (more so than the site). Need to deal with:
  • Risk of being flagged as a spammer! Solution: be very open about any unsolicited tweet
  • Slowing down the search requests
  • Duplicate Tweets appearing in search results (diff IDs but 1 will be invalid)
  • Invalid users (e.g. spammers removed)
  • Dealing with limits (150 requests per hour, but the app was easily whitelisted and its limit raised! Thank you Twitter)
  • Broken profile images fixed thanks to Tweetimag.es. They do the hard work!
  • Twitter4j does not disambiguate User Id vs Screen Name... yet!
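
As a worked example of "slowing down the search requests": a limit of 150 requests per hour means at most one request every 3600 / 150 = 24 seconds. The hypothetical `SearchPacer` below does that arithmetic for any hourly limit, rounding up so the limit is never exceeded.

```java
/** Hypothetical helper: spaces requests evenly under an hourly rate limit. */
final class SearchPacer {

    private static final long MILLIS_PER_HOUR = 3_600_000L;

    /** Smallest delay between requests, in milliseconds, that stays within the limit. */
    static long minIntervalMillis(int requestsPerHour) {
        if (requestsPerHour <= 0) {
            throw new IllegalArgumentException("requestsPerHour must be positive");
        }
        // Ceiling division: round up so the hourly total cannot exceed the limit.
        return (MILLIS_PER_HOUR + requestsPerHour - 1) / requestsPerHour;
    }
}
```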

Spotify:

  • A bit of an unknown
  • Having to deal with incomplete replies, timeouts, etc.
  • Solution: Transactional and able to re-request missing data
  • Not using the new official Metadata API yet... but playlists are missing from the new APIs!
  • Read the fine print to avoid choosing a domain name that violates copyright...

Bitlyj:

  • Excellent, fast, supports bit.ly and j.mp
  • Twitter does not return geo-location with a search
  • Will geo location get a proper format in user profile (at the moment 'anything goes')?

Stats

In 2 months:

  • ~1,000 followers @spotichart
  • ~200,000 Tweets with 'spotify'
  • ~50,000 Tweets with a link
  • ~18,000 Tweople
  • ~25,000 Tracks
  • ~20,000 Albums
  • ~12,000 Artists
  • ~5,000 Playlists
  • AdSense Revenue: enough for a coffee
  • (not a venti)

Thanks!

@benoitx

SpotiChart.com

  • A Developer's perspective

Unanswered Questions