
Rohit Nair

on 24 March 2014


Because the Internet of Things is a completely new paradigm of the Internet, research on it is still at a preliminary stage.

Current work on data mining in the Internet of Things falls mainly into three areas:
Some work focuses on managing and mining RFID stream data.
Some work addresses querying, analyzing and mining the moving-object data generated by various IoT devices.
Other work concerns knowledge discovery from sensor data. Sensor networks have several distinctive characteristics, so data mining in sensor networks has its own features.

Although there are several contributions towards data mining for IoT, they mainly focus on the rudiments of IoT, e.g. sensor networks and RFID. As a completely new paradigm of the Internet, IoT still lacks models and theories to direct its data mining.

Multi-layer data mining model for IoT

Distributed data mining model for IoT

Grid based data mining model for IoT

Data mining model for IoT from multi-technology integration perspective.


The Internet of Things (IoT) refers to the next generation of the Internet, which will contain trillions of nodes representing various objects, from small ubiquitous sensor devices and handhelds to large web servers and supercomputer clusters.

It is regarded as the next technological revolution after those of the computer and the Internet. It integrates new computing and communication technologies and sets the development direction of the next generation of the Internet.

IoT is the core of the Smart Planet concept proposed by IBM Corporation. Smart objects in the Internet of Things are able to communicate via the Internet using new information and communication technologies.

The data in the Internet of Things can be categorized into several types: RFID data streams, addresses/unique identifiers, descriptive data, positional data, environment data, sensor network data, etc. This variety poses great challenges for managing, analyzing and mining data in the Internet of Things.

Data mining 
The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures,  visualization, and online updating.



“From the viewpoint of technology, IoT is an integration of sensor networks, which include RFID, and ubiquitous network. From the viewpoint of economy, it is an open concept, which integrates new related technologies and applications, productions and services, R. & D., industry and market.”

The Internet of Things will produce large volumes of data. Take as an example a supermarket in a supply chain that adopts RFID technology. Raw RFID data has the form (EPC, location, time): EPC is the unique identifier read by an RFID reader, location is the place where the reader is positioned, and time is the time when the reading took place. A raw RFID record needs about 18 bytes. A supermarket holds about 700,000 RFID tags. So if the supermarket has readers that scan every item once per second, about 12.6 MB of RFID data (700,000 × 18 bytes) will be produced per second, and the data will reach roughly 1.09 TB per day. Thus, it is necessary to develop effective methods for managing, analyzing and mining RFID data.
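The arithmetic behind this estimate can be checked with a short sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and that every tag is read exactly once per second; the constant names are illustrative only.

```python
# Back-of-the-envelope RFID data volume, using the figures quoted in the text.
BYTES_PER_RECORD = 18            # one raw (EPC, location, time) record
TAGS = 700_000                   # tagged items in the supermarket
SCANS_PER_SECOND = 1             # assumption: each tag is read once per second
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_second = BYTES_PER_RECORD * TAGS * SCANS_PER_SECOND
bytes_per_day = bytes_per_second * SECONDS_PER_DAY

print(f"{bytes_per_second / 1e6:.1f} MB per second")
print(f"{bytes_per_day / 1e12:.2f} TB per day")
```

Under these assumptions the script reports 12.6 MB per second and about 1.09 TB per day, which is still far too much to store and mine without filtering and compression.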

Today computers—and, therefore, the Internet—are almost wholly dependent on human beings for information.

Nearly all of the roughly 50 petabytes (a petabyte is 1,024 terabytes) of data available on the Internet were first captured and created by human beings—by typing, pressing a record button, taking a digital picture or scanning a bar code.

Conventional diagrams of the Internet ... leave out the most numerous and important routers of all - people. The problem is, people have limited time, attention and accuracy—all of which means they are not very good at capturing data about things in the real world.

The Internet of Things has the potential to change the world, just as the Internet did. Maybe even more so.

Connecting everything on Earth via the Internet may sound like mission impossible, but the Internet of Things (IoT) will dramatically change our lives in the foreseeable future by making many "impossible" things possible.
Many consider the massive data generated or captured by the IoT to contain highly useful and valuable information. The changes, potentials, open issues, and future trends of this field are also addressed.
In this paper, we propose four data mining models for the Internet of Things: a multi-layer data mining model, a distributed data mining model, a Grid-based data mining model and a data mining model from a multi-technology integration perspective.
Among them, the multi-layer model includes four layers: 1) data collection layer, 2) data management layer, 3) event processing layer, and 4) data mining service layer.
Several key issues in data mining of IoT are also discussed.

Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.
Internet of Things (IoT)
Multi-layer data mining model for IoT
Based on the architecture of IoT and the data mining framework for RFID, we propose the following multi-layer data mining model for IoT, as shown in Fig. 1. It is divided into four layers: data collection layer, data management layer, event processing layer and data mining service layer.

Among them, the data collection layer uses devices to collect data from smart objects. Different types of data require different data collection strategies. In the process of data collection, a series of problems, e.g. energy efficiency, misreading, repeated reading, fault tolerance, data filtering and communications, must be solved.

The data management layer uses a centralized or distributed database or data warehouse to manage the collected data. After object identification, data abstraction and compression, the various data are saved in the corresponding database or data warehouse. Taking RFID data as an example, the raw format of the RFID data stream is (EPC, location, time), where EPC is the smart object's ID. Smart objects are connected with each other via the data management layer in the Internet of Things.
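As an illustration of data abstraction and compression for RFID streams, the sketch below collapses consecutive readings of the same EPC at the same location into a single "stay" record with an entry and exit time. This is one common way to shrink raw (EPC, location, time) data, not necessarily the method the model intends; the `Reading` and `Stay` types and the `compress` function are hypothetical names.

```python
from itertools import groupby
from typing import Iterable, List, NamedTuple


class Reading(NamedTuple):
    epc: str        # unique identifier of the smart object
    location: str   # where the reader is positioned
    time: int       # when the reading took place (e.g. seconds)


class Stay(NamedTuple):
    epc: str
    location: str
    time_in: int
    time_out: int


def compress(readings: Iterable[Reading]) -> List[Stay]:
    """Merge repeated readings of one object at one location into stays."""
    stays: List[Stay] = []
    ordered = sorted(readings, key=lambda r: (r.epc, r.time))
    for _epc, group in groupby(ordered, key=lambda r: r.epc):
        current = None
        for r in group:
            if current is not None and current.location == r.location:
                # Same place as before: just extend the stay.
                current = current._replace(time_out=r.time)
            else:
                if current is not None:
                    stays.append(current)
                current = Stay(r.epc, r.location, r.time, r.time)
        if current is not None:
            stays.append(current)
    return stays
```

An object that is read once per second while it sits on a shelf for an hour then costs one record instead of 3,600.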

An event combines data, time and other factors, so it provides a high-level mechanism for data processing in IoT. The event processing layer is used to analyze events in IoT effectively, so event-based queries or analysis can be performed in this layer.
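A minimal sketch of event-based analysis, assuming readings arrive as (EPC, location, time) tuples: it detects a simple complex event, namely an item reaching the exit without ever being seen at the checkout. The location names "checkout" and "exit" and the function name are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Reading = Tuple[str, str, int]  # (EPC, location, time)


def detect_unpaid_exits(readings: Iterable[Reading]) -> List[Tuple[str, int]]:
    """Return (EPC, time) for each exit reading not preceded by a checkout."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    events: List[Tuple[str, int]] = []
    # Process readings in time order so "preceded by" is well defined.
    for epc, location, time in sorted(readings, key=lambda r: r[2]):
        if location == "exit" and "checkout" not in seen[epc]:
            events.append((epc, time))
        seen[epc].add(location)
    return events
```

The point of the event layer is visible here: the application asks about a meaningful occurrence rather than scanning raw readings itself.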

The data mining service layer is built on data management and event processing. Various object-based or event-based data mining services, such as classification, forecasting, clustering, outlier detection, association analysis or pattern mining, are provided for applications, e.g. supply chain management, inventory management and optimization. The architecture of this layer is service-oriented.

Distributed data mining model for IoT
In this model, the global control node is the core of the whole data mining system. It chooses the data mining algorithm and the data sets for mining, and then navigates to the sub-nodes containing these data sets.

The sub-nodes receive the raw data from various smart objects. The raw data are pre-processed by data filtering, data abstraction and data compression, and then saved in the local data warehouse. Local models are obtained by event filtering, complex event detection and data mining in the local nodes.

At the request of the global control node, these local models are submitted to it and aggregated to form the global model. Sub-nodes exchange object data, process data and knowledge with each other. The whole process is controlled by a multi-agent-based collaborative management mechanism.
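The local-model and global-model idea can be sketched as follows. Here each sub-node's "local model" is simply a count of readings per location, and the global control node sums those counts; real local models would be richer, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Reading = Tuple[str, str, int]  # (EPC, location, time)


def build_local_model(local_records: Iterable[Reading]) -> Counter:
    """Runs on a sub-node: summarize local data without shipping raw records."""
    return Counter(location for _epc, location, _time in local_records)


def aggregate(local_models: List[Counter]) -> Counter:
    """Runs on the global control node: merge local models into a global one."""
    global_model: Counter = Counter()
    for model in local_models:
        global_model.update(model)
    return global_model
```

Only the small summaries travel to the global node, which is what reduces the storage and computing demands on the central site.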

The basic idea of IoT is to connect various smart objects via the Internet.
Smart objects thus become intelligent, context-aware, and remotely operable. Therefore we may regard the smart objects of IoT as a kind of resource for Grid computing, and then use the data mining services of the Grid to implement data mining operations for IoT.

In this paper, based on DataMiningGrid which was put forward by Stankovski, V. et al., we propose a Grid-based data mining model for IoT, as shown in Fig.

It also offers various software resources, e.g. event processing algorithms, data warehouses and data mining applications. Globus Toolkit 4 is adopted to implement the various Grid services. We can also make full use of the high-level services and client components of DataMiningGrid for data mining in IoT.

Data mining model for IoT from multi-technology integration perspective
Grid based data mining model for IoT
The Internet of Things is one of the most important development directions of the next-generation Internet.

At the same time, there are still a number of new directions, e.g., trusted network, ubiquitous network, grid computing, cloud computing etc. Therefore, from the perspective of multi-technology integration, we propose the corresponding data-mining model for IoT, as shown in Figure.
In this model, data comes from the context-awareness of individuals, smart objects or the environment.

128-bit IPv6 addresses are adopted, and a variety of ubiquitous ways of accessing the future Internet are provided, such as Intranet/Internet, FTTx/xDSL, sensor devices, RFID, WLAN/WiMAX, 2.5/3/4G mobile access and so on. A trusted control plane ensures the credibility and controllability of data transmission.

On this basis, we apply data mining tools and algorithms, and deliver the knowledge gained to various service-oriented applications, such as intelligent transportation and intelligent logistics.

Data collection from smart objects of IoT 

Data abstraction, compression, index, aggregation and multi-dimensional query.

Event filtering, aggregation and detection.

Data mining towards the next generation of Internet.

One possible way to make a system smart, or "think" like a human, is to use data mining technologies, but making it work as expected remains a difficult problem today, just as making computers think by themselves has been a long road.

To develop a convenient, high-performance, and intelligent IoT system using the data mining technologies, we need to know the key issues they face.

The focus in this section is thus on three different research aspects: the changes, the potentials, and the open issues of the IoT in using data mining technologies in the design, integration, development, and use of a smart IoT system.

The potentials of applying data mining technologies to the IoT can be summarized as follows:
1) To people: One potential is to provide more accurate suggestions; that is, the IoT acts like a semi-automatic intelligent system. Combining mining technologies with iterative learning methods will bring the IoT up to another level, in the sense that the system may be able to learn incrementally and even interact with people.
2) To themselves: One promising research direction for smart objects is for things to think by themselves. In practice, mining technologies have the potential to filter out redundant data and to decide which data or information needs to be uploaded to the system. This is useful for applications that cover a broad region with limited resources, such as precision agriculture and natural resource monitoring.
3) To other things: Another key research issue for the IoT or M2M is communication and collaboration with other things. If objects or systems are similar to each other, and thus have similar goals or requirements, clustering technologies can be used to put them into the same group, so that objects can easily tell which objects need the information they hold (those in the same group) and which do not (those in different groups).
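As a concrete illustration of item 2, a smart object can suppress redundant readings with a simple deadband filter: a value is uploaded only when it differs from the last uploaded value by at least a threshold. This is a deliberately simple stand-in for the mining-based filtering described above; the function name and threshold are assumptions.

```python
from typing import Iterable, List


def deadband_filter(samples: Iterable[float], threshold: float) -> List[float]:
    """Keep a sample only if it moved at least `threshold` from the last kept one."""
    uploaded: List[float] = []
    last = None
    for value in samples:
        if last is None or abs(value - last) >= threshold:
            uploaded.append(value)
            last = value
    return uploaded


# Temperature readings from a hypothetical field sensor.
print(deadband_filter([20.0, 20.1, 20.2, 21.0, 21.1, 19.5], threshold=0.5))
```

With a threshold of 0.5, the six readings above reduce to three uploads (20.0, 21.0 and 19.5), which saves energy and bandwidth on resource-limited nodes.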

As an important development direction of the next generation of the Internet, the Internet of Things attracts much attention from industry and academia.

IoT data has many characteristics, such as distributed storage, massive time-related and position-related data, and limited node resources. These make data mining in IoT a challenging task.

The distributed data mining model addresses the problems that arise from storing data at different sites. At the same time, the complexity of the problem is decomposed, and the requirements for high performance, high storage capacity and high computing power at central nodes are reduced.

The Grid-based data mining model adopts the Grid framework to realize data mining functions.

The data mining model from the multi-technology integration perspective describes the corresponding framework for the future Internet.

To handle the flood of big data, sampling technologies, compression technologies, incremental learning technologies, and filtering technologies all will become more and more important, especially when using data mining technologies to analyze the data of IoT.
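To make "incremental learning" concrete, the sketch below keeps a running mean and variance of a sensor stream using Welford's online algorithm, so each reading is processed once and never stored. It is offered as one small example of the incremental style, not as a method taken from the slides; the class name is hypothetical.

```python
class RunningStats:
    """Mean and sample variance of a stream, updated one value at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 until at least two values are seen.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(reading)
print(stats.mean, stats.variance)
```

Memory use stays constant no matter how long the stream runs, which is the property that matters for resource-limited IoT nodes.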
Thank you!!