SpotiChart.com
A Developer's perspective
@benoitx

Outline
  Who?
  Idea?
  Architecture
  Twitter Search
  Spotify
  Scheduled Stats
  Web Site
  Feeds & Playlists
  What I learned
  Unanswered Questions
  Stats
  Thanks!

Who?
  Benoit Xhenseval
  Tech Architect in the City for... years
  Done and sold dot-com
  CTO of a startup developing a Securities Financing Platform
  Looking for a bit of fun!

Idea? What does it do?
  Find Tweets with a Spotify item (track, album, artist or playlist)
  Supports URI, official and short URL too
  Count the Tweets
  Create a global music chart!
  Simples...

Architecture: Enterprise Java Stack
  JDK 1.6
  Spring 3 and Spring MVC 3
  Quartz for scheduling
  Maven 3 for build (jetty:run +++)
  Hibernate
  Transactional & multi-threaded
  Web site does not touch the DB
  Using Google Analytics and AdSense

Twitter Search
  Using Twitter4j 2.0.10
  Search for 'spotify' every n secs.
  Parse URI or short URL
  Persist the Tweet and user info
  Fetch user details (name/followers)
  Deal with limits, broken profile images
  Hand over to another thread for the Spotify fetch (in memory but transactionally)
  Reply to users only the 1st time: avoid spamming
  Use Spring messages to support i18n
  Twitter determines the tweet language... sometimes wrongly!
  Eliminate 'duplicate' tweet postings

Spotify
  From the URL, contact Spotify and get information: Track / Album / Artist / Playlist
  Store in local DB
  Deal with connectivity/timeout issues (transactional)

Scheduled Stats
  Every 30 min, refresh stats:
    Layer of DAOs to collect data (Hibernate)
    Result objects stored in distributed EHCache
    Web site will use the cache only and be stateless
    Stats could be refreshed more frequently
  Follow a schedule for sending Tweets:
    Quartz (cron); send tweet with stats
    Bitlyj to shorten URLs, using j.mp: saving a FULL 2 characters... (back to the 70s?)
    Using Geo location API for sending tweets... just because we can

Web Site
  Spring MVC 3 RC1
  All requests mapped via @Controller and @RequestMapping
  Web site only uses EHCache for data
  Rendering via JSP and JSON for charts (OpenFlashChart)
  Average response time < 10 ms
  Heatmap applet from JTreeMap

Feeds & Playlists
  Atom Feed with Top 10 tracks
    Using Rome, result in cache
    URI for each entry is key for readers
  Spotify Playlists
    Top 20 for Day, Week and Month
    Both changing and fixed

What I learned
  Quick development (3 days), yet robust and scalable tech
  Twitter:
    Twitter API interface quite robust (more than the site).
    Need to deal with:
      Risk of being flagged as a spammer! Solution: be very open about any unsolicited tweet
      Slowing down the search requests
      Duplicate Tweets appearing in search results (different IDs but 1 will be invalid)
      Invalid users (e.g. spammers removed)
      Limits (150 requests per hour, but the app was easily whitelisted and its limit upped! Thank you Twitter)
    Broken profile images fixed thanks to Tweetimag.es. They do the hard work!
    Twitter4j does not disambiguate User Id vs Screen Name... yet!
  Spotify:
    A bit of an unknown
    Having to deal with incomplete replies / timeouts etc. Solution: transactional and able to re-request missing data
    Not using the new official Meta-data API yet... but playlists are missing in the new APIs!
    Read the fine print to avoid choosing a domain that violates copyright...
  Bitlyj:
    Excellent, fast, supports bit.ly and j.mp

Unanswered Questions
  Twitter does not return geo-location with a search
  Will geo location get a proper format in the user profile (at the moment 'anything goes')?

Stats
  In 2 months:
    ~1,000 followers @spotichart
    ~200,000 Tweets with 'spotify'
    ~50,000 Tweets with a link
    ~18,000 Tweople
    ~25,000 Tracks
    ~20,000 Albums
    ~12,000 Artists
    ~5,000 Playlists
  AdSense revenue: enough for a coffee (not a venti)
  One for the 10 famous Belgians!

Deployment
  7-year-old PC, 1 GB RAM, 320 GB
  Ubuntu 9.04
  Tomcat 6, single WAR for now
  MySQL 5.1
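
The "Parse URI or short URL" step could look roughly like the sketch below. It is not the SpotiChart code: the class name SpotifyLinkParser and the method toCanonicalUri are hypothetical, and the patterns assume the spotify:track:ID / http://open.spotify.com/track/ID forms plus the user-scoped playlist form. Short URLs are not handled here; they would first need to be expanded by the shortener.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds the first Spotify item reference in a tweet. */
final class SpotifyLinkParser {

    // spotify:track:ID, spotify:album:ID, spotify:artist:ID and the
    // equivalent http(s)://open.spotify.com/track/ID forms (assumed formats).
    private static final Pattern ITEM = Pattern.compile(
        "(?:spotify:|https?://open\\.spotify\\.com/)"
        + "(track|album|artist)[:/]([A-Za-z0-9]+)");

    // spotify:user:NAME:playlist:ID or .../user/NAME/playlist/ID (assumed formats).
    private static final Pattern PLAYLIST = Pattern.compile(
        "(?:spotify:|https?://open\\.spotify\\.com/)"
        + "user[:/]([^:/\\s]+)[:/]playlist[:/]([A-Za-z0-9]+)");

    private SpotifyLinkParser() {
    }

    /**
     * Returns the item as a spotify: URI so that URI and URL forms of the
     * same item can be counted together, or null when no item is found.
     */
    static String toCanonicalUri(String tweetText) {
        if (tweetText == null) {
            return null;
        }
        Matcher playlist = PLAYLIST.matcher(tweetText);
        if (playlist.find()) {
            return "spotify:user:" + playlist.group(1)
                + ":playlist:" + playlist.group(2);
        }
        Matcher item = ITEM.matcher(tweetText);
        if (item.find()) {
            return "spotify:" + item.group(1) + ":" + item.group(2);
        }
        return null;
    }
}
```

Normalising every match to one canonical form means a tweet with a URI and a tweet with the official URL of the same track end up under the same chart entry.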
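
The "Count the Tweets" and "Eliminate 'duplicate' tweet postings" ideas can be illustrated in memory with the sketch below. The real site collects its data through Hibernate DAOs and serves it from EHCache, so this is only an illustration: the class TweetChart is hypothetical, and treating a repeated (user, item) pair as a duplicate is an assumption, since the talk does not define what counts as a duplicate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory chart: counts tweets per Spotify item. */
final class TweetChart {

    private final Map<String, Integer> counts = new HashMap<String, Integer>();
    private final Set<String> seen = new HashSet<String>();

    /**
     * Records one tweet. Assumption: a second tweet of the same item by the
     * same user is a duplicate and is ignored. Returns false for duplicates.
     */
    boolean record(String screenName, String spotifyUri) {
        if (!seen.add(screenName + "|" + spotifyUri)) {
            return false;
        }
        Integer current = counts.get(spotifyUri);
        counts.put(spotifyUri, current == null ? 1 : current + 1);
        return true;
    }

    /** Returns up to n items, most tweeted first; ties are ordered by URI. */
    List<String> top(int n) {
        List<Map.Entry<String, Integer>> entries =
            new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a,
                               Map.Entry<String, Integer> b) {
                int byCount = b.getValue().compareTo(a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            }
        });
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

A scheduled job could rebuild such a ranking every 30 minutes and publish only the resulting list, which keeps the web tier free of database access.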