eToro Openbook SEO - Node.js solution


Omry Hay

on 8 May 2013


Transcript of eToro Openbook SEO - Node.js solution

eToro Openbook SEO

Agenda: So what is the problem? So what can we do? Node.js and PhantomJS. The Node.js solution.

Google's crawler has a solution for pages built with Ajax calls from the client to the server. When you want the Google crawler to crawl a page that contains Ajax calls, you simply add a meta tag to the HTML, or a #! token to the URL, and the crawler will issue a new request with a special query-string parameter. This gives you the chance to know the request came from the crawler and to serve it a full, static HTML file.
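The scheme can be sketched with a couple of small helpers. The `_escaped_fragment_` parameter name comes from Google's AJAX-crawling specification; the function names themselves are ours, not eToro's code:

```javascript
// Sketch of the AJAX-crawling scheme: a "pretty" URL such as
//   https://openbook.etoro.com/#!/everyone/
// is re-requested by the crawler as
//   https://openbook.etoro.com/?_escaped_fragment_=%2Feveryone%2F
function toEscapedFragmentUrl(prettyUrl) {
  var parts = prettyUrl.split('#!');
  if (parts.length < 2) return prettyUrl;           // no #! token: nothing to map
  var base = parts[0];
  var sep = base.indexOf('?') === -1 ? '?' : '&';
  return base + sep + '_escaped_fragment_=' + encodeURIComponent(parts[1]);
}

// The server side of the contract: spot the crawler's rewritten request.
function isCrawlerRequest(requestUrl) {
  return requestUrl.indexOf('_escaped_fragment_=') !== -1;
}

console.log(toEscapedFragmentUrl('https://openbook.etoro.com/#!/everyone/'));
// -> https://openbook.etoro.com/?_escaped_fragment_=%2Feveryone%2F
console.log(isCrawlerRequest('https://openbook.etoro.com/?_escaped_fragment_=%2Feveryone%2F'));
// -> true
```

A request that tests true for `isCrawlerRequest` is the one you answer with the pre-rendered static HTML.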
https://developers.google.com/webmasters/ajax-crawling

Openbook Website

The Openbook website is a dynamic, data-driven application. We use jQuery templates to render all the dynamic content in the application. This means the server returns a simple HTML file, and the client manages the HTML rendering at run-time, using the API to fetch the relevant data. In other words, all of the content, and the HTML itself, is rendered on the client side.

What is SEO?

Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine's "natural" or un-paid ("organic") search results. In general, the earlier (higher ranked on the search results page) and more frequently a site appears in the search results list, the more visitors it will receive from the search engine's users. SEO may target different kinds of search, including image search, local search, video search, academic search, news search and industry-specific vertical search engines.

So what is the problem?

Crawlers don't wait for the JavaScript to load and run. They crawl the HTML they get from the server, and won't wait for the rendering process or for any API calls the client makes to the server.
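The run-time rendering step that the crawler never executes can be mimicked with a tiny stand-in for jQuery templates. Everything here is illustrative: a plain string replace instead of the jQuery template plugin, and made-up markup and data rather than Openbook's real templates or API shape:

```javascript
// Minimal stand-in for a jQuery template: {{key}} placeholders are
// filled in from a data object fetched over the API.
function renderTemplate(template, data) {
  return template.replace(/\{\{(\w+)\}\}/g, function (match, key) {
    return data[key];
  });
}

// Hypothetical rankings row (the real markup and API differ).
var rankingRow = '<li>{{username}}: {{gain}}%</li>';
var apiResponse = { username: 'omryhay', gain: 12 };

console.log(renderTemplate(rankingRow, apiResponse));
// -> <li>omryhay: 12%</li>
```

The crawler only ever sees the template string with its empty placeholders, never the filled-in result.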
That's why our pages looked like this in the "eyes" of the crawler: [screenshot of the unrendered page]

So what can we do?

There are two options:

1. Create a static page for each of our views. This means maintaining about 15 views and duplicating the client logic on the server.
2. Write something that emulates a browser and runs the JavaScript code, returning the view as full HTML once it is ready.

The solution

Whenever Openbook gets a request from the Google crawler, we identify the request and create a request to our Node.js server, passing all the relevant parameters in the query string. The Node.js server then makes the same request to Openbook using the PhantomJS headless browser. The browser runs all the necessary JavaScript and returns a full, static HTML response to Openbook with all the relevant dynamic data.

HTML Rendering Process

1. The client requests a page:
   https://openbook.etoro.com/rankings
2. The server, according to the route, serves the view, which is a simple HTML file:
   view-source:https://openbook.etoro.com/rankings/
3. The client requests the data through an API request:
   https://openbook.etoro.com/api/rankings/180/gain/1/
4. The client renders the page based on the data, using jQuery templates.
5. The page is fully loaded.

Node.js

Node.js contains a built-in HTTP server library, making it possible to run a web server without external software such as Apache or IIS, and allowing more control over how the web server works. Node.js enables web developers to create an entire web application in JavaScript, both server-side and client-side.

Node tells the operating system that it should be notified when a new connection is made, and then it goes to sleep. If someone new connects, it executes the callback. Each connection is only a small heap allocation. This is in contrast to today's more common concurrency model, where OS threads are employed. Thread-based networking is relatively inefficient and very difficult to use. Node will show much better memory efficiency under high loads than systems which allocate 2 MB thread stacks for each connection. Furthermore, users of Node are free from worries of dead-locking the process: there are no locks. Almost no function in Node directly performs I/O, so the process never blocks. Because nothing blocks, less-than-expert programmers are able to develop fast systems.
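That "register a callback, then go to sleep" model can be seen with nothing more than a timer. This is a toy illustration of the event loop, not Node's actual networking code:

```javascript
// Nothing below blocks: the callback fires only when the event loop
// is notified, after all the synchronous code has finished.
var order = [];

order.push('register callback');
setTimeout(function () {          // stands in for "a new connection arrives"
  order.push('callback runs');
  console.log(order.join(' -> '));
}, 0);
order.push('main code done (Node "goes to sleep")');

// prints: register callback -> main code done (Node "goes to sleep") -> callback runs
```

Even with a zero-millisecond delay, the callback always runs after the synchronous code, which is exactly why a single Node process can juggle many connections without threads or locks.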

http://nodejs.org/

PhantomJS

PhantomJS is a headless WebKit, scriptable with a JavaScript API. It has fast, native support for various web standards: DOM handling, CSS selectors, JSON, Canvas, and SVG.

  • It gives you the ability to run unit tests on your JavaScript code.
  • It can programmatically capture web contents, including SVG and Canvas, and create website screenshots with thumbnail previews.
  • It can monitor page loading and automate performance analysis using YSlow and Jenkins.

Crawler Request

1. The crawler requests a view.
2. The Openbook server identifies that it is a crawler and sends the request to the Node.js server.
3. Node.js uses the PhantomJS headless browser to make the same call to Openbook, running all the JavaScript and Ajax calls.
4. A full snapshot of the HTML is returned.

So now the Google crawler sees our page like this:
http://webcache.googleusercontent.com/search?q=cache:mjrolXS9lLgJ:https://openbook.etoro.com/+&cd=1&hl=en&ct=clnk&gl=il#/everyone/

Sample:
http://aws-ob-node.etoro.com/api/render-page/?pageURL=https://openbook.etoro.com/omryhay