Final Project

by

Chuhan Hong

on 7 December 2016

Transcript of Final Project

United States Domestic Passenger Enplanements:
Forecasting with an ARIMA Model

2. ARIMA Model
An AutoRegressive Integrated Moving Average (ARIMA) model predicts future values of a time series by a linear combination of its past values and a series of errors (also known as random shocks or innovations).
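The definition above can be illustrated with a minimal R sketch on simulated data (not the project's data). The AR coefficient 0.6, the MA coefficient 0.3, the series length, and the seed are arbitrary values assumed for illustration.

```r
# Simulate an ARIMA(1,1,1) series with base R, then fit the same structure.
set.seed(1)
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.6, ma = 0.3), n = 500)

# Estimate the AR and MA coefficients; with enough data they are
# expected to land near the simulated values.
fit <- arima(y, order = c(1, 1, 1))
print(coef(fit))

# One-step-ahead forecast: built from past values and past shocks.
next_value <- predict(fit, n.ahead = 1)$pred
print(next_value)
```

Here `ar1` weights the previous differenced value and `ma1` weights the previous shock, which is the "linear combination" the definition refers to.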
3. Data Resources
The data set comes from the Bureau of Transportation Statistics.
We use the U.S. Air Carrier Traffic Statistics series Domestic Passenger – Passenger Enplanements (January 1996 to August 2016), 248 monthly observations in total.
Source:
http://www.rita.dot.gov/bts/acts/customized/table?adfy=1996&adfm=1&adty=2016&adtm=8&aos=0&artd=1&arti&arts&asts&astns&astt=3&ascc&ascp=1

4. Analysis
1. Introduction
This project aims to predict the number of United States domestic passenger enplanements with an ARIMA model.
With this model and a sufficient amount of data, an airline can forecast passenger numbers and arrange and adjust its routes accordingly.
Each airline can also adjust the size of its fleet to match passenger numbers and make better-informed decisions on whether to invest.
As a result, business decisions become more efficient, which can lead to higher profits.

6. Diagnosis
7. Conclusion

With this model and a sufficient amount of data, an airline can forecast passenger numbers and arrange and adjust its routes accordingly.
Each airline can also adjust the size of its fleet to match passenger numbers and make better-informed decisions on whether to invest.
As a result, business decisions become more efficient, which can lead to higher profits.
Diagnostic Plots
Contents
1. Introduction
2. ARIMA Model
3. Data Resources
4. Analysis
5. Forecast & Test
6. Diagnosis
7. Conclusion
Group 5:
Chuhan Hong, Jiahua Gu, Runtian Liu

3. Data Modification
Take the logarithm of the series to smooth it, then calculate the first difference.
4. Check ACF & PACF
The ACF and PACF charts show significant spikes at lags 12, 24, ..., which indicates that the series is seasonal with a period of 12 months. We therefore used a seasonal ARIMA model.
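The seasonality check can be sketched in R as follows. This sketch uses `AirPassengers`, a monthly series that ships with base R, as a stand-in for the BTS data, and the 1.96/sqrt(n) bound is the usual approximate 95% significance band; both are assumptions for illustration.

```r
# AirPassengers is a stand-in monthly series from base R (not the BTS data).
x <- diff(log(AirPassengers))            # log transform, then first difference
r <- acf(x, lag.max = 30, plot = FALSE)

# For a series with frequency 12, acf() reports lags in years,
# so a lag of 1.0 corresponds to 12 months.
lag_months <- round(r$lag[, 1, 1] * 12)
bound <- 1.96 / sqrt(length(x))          # approximate 95% significance bound

# Autocorrelation at the seasonal lags 12 and 24.
seasonal <- abs(r$acf[lag_months %in% c(12, 24), 1, 1])
print(seasonal > bound)
```

Autocorrelations at multiples of 12 that exceed the bound are the "spikes" that motivate adding a seasonal term with period 12.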
First, the standardized residuals show no volatility clustering.
Second, the ACF plot of the residuals shows no significant autocorrelation.
Third, the p-values for the Ljung-Box statistic are relatively large, so we cannot reject the hypothesis that the residuals are uncorrelated.
Therefore, the model is adequate for forecasting.
[Diagnostic plots: standardized residuals; ACF of residuals; p-values for the Ljung-Box statistic]
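The Ljung-Box check can also be run directly rather than read off a plot. The sketch below is a minimal example under stated assumptions: it fits the same model structure to the built-in `AirPassengers` series (a stand-in, not the BTS data), and the choices `lag = 10` and `fitdf = 3` (one degree of freedom per estimated coefficient) are illustrative.

```r
# Fit the same model structure to a stand-in series from base R.
fit <- arima(log(AirPassengers), order = c(2, 1, 0),
             seasonal = list(order = c(1, 0, 0), period = 12), method = "ML")
res <- residuals(fit)

# Ljung-Box test on the residuals; a large p-value means there is no
# evidence of remaining autocorrelation up to the chosen lag.
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 3)
print(lb$p.value)
```

Note that `tsdiag()` reports p-values for a range of lags without a degrees-of-freedom correction, so its numbers can differ from a single `Box.test()` call.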
Find the best ARIMA model
And we have:
Y ~ ARIMA(2, 1, 0) × (1, 0, 0)_12, where Y is the log-transformed series and 12 is the seasonal period
5. Forecast & Test
We used the first 240 observations to build the model and the last 8 to test its accuracy.
Then we converted the data set into a time-series format.
We plotted the series to check its trend.
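The hold-out evaluation described above can be sketched with base R alone. This is an illustration under assumptions: `AirPassengers` stands in for the BTS series, and the error measure shown (mean absolute percentage error) is one common summary, not necessarily the one reported in the slides.

```r
y <- AirPassengers                        # stand-in monthly series from base R
n <- length(y)
h <- 8                                    # number of held-out observations

# First n - h observations build the model; the last h test it.
train <- ts(y[1:(n - h)], start = start(y), frequency = frequency(y))
test <- as.numeric(y[(n - h + 1):n])

fit <- arima(log(train), order = c(2, 1, 0),
             seasonal = list(order = c(1, 0, 0), period = 12), method = "ML")

# Forecast on the log scale, then back-transform to the original scale.
pred <- as.numeric(exp(predict(fit, n.ahead = h)$pred))

ape <- abs((test - pred) / test)          # absolute percentage error per month
mape <- mean(ape)
print(mape)
```

Because the model is fitted to the log series, `exp()` of the forecast is on the original scale and can be compared directly with the held-out values.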
4. Analysis
Almost smooth and steady
We used R to compute the parameters of the best ARIMA model.
Appendix:
#Final Project
#Group 5
#Group Member: Jiahua Gu, Chuhan Hong, Runtian Liu
#Prof: Bahaeddine Taofuik
#Date: Nov,18th

#Sources from:
#http://www.rita.dot.gov/bts/acts/customized/table?adfy=1996&adfm=1&adty=2016&adtm=8&aos=0&artd=1&arti&arts&asts&astns&astt=3&ascc&ascp=1



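library(forecast) # provides auto.arima() and forecast()

data.x <- read.csv("AirCarrierTrafficStatistics.csv", header=TRUE, stringsAsFactors = FALSE) # Input data

# Strip thousands separators so that Total can be converted to numeric
data.x$Total <- as.numeric(gsub(",","",data.x$Total))
data.test <- data.x[241:248,2] # last 8 observations, held out for testing
data.model <- data.x[1:240,2] # first 240 observations, used to build the model
num <- ts(data.model,start=c(1996,1),frequency=12) # monthly series starting January 1996

plot.ts(num, las=TRUE, col="black", main="passenger enplanements",xlab="Time Series", ylab="number")
dev.off()

# Log transform, then first difference
totallog <- log(num)
totaldiff <- diff(totallog, differences=1)
plot.ts(totaldiff,las=TRUE,xlab="time", ylab="difference")

# Plot the ACF, then print its values
acf(totaldiff, lag.max=30)
acf(totaldiff, lag.max=30,plot=FALSE)

# Plot the PACF, then print its values
pacf(totaldiff, lag.max=30)
pacf(totaldiff, lag.max=30,plot=FALSE)

# Search for the best ARIMA model
auto.arima(totallog,trace=TRUE)

# Fit ARIMA(2,1,0) x (1,0,0) with period 12 by maximum likelihood
totalarima1 <- arima(totallog,order=c(2,1,0),seasonal=list(order=c(1,0,0),period=12),method="ML")
totalarima1

# Forecast 8 months ahead with a 99.5% prediction interval
# (forecast() replaces forecast.Arima(), which recent versions of the forecast package no longer export)
totalforecast <- forecast(totalarima1,h=8,level=c(99.5))
totalforecast
par(mfrow=c(2,1))
plot(totalforecast)
plot(log(data.x$Total),type="l",xlab="time",ylab="log(passenger enplanements)")
# diagnose
tsdiag(totalarima1)
dev.off()

# Forecast errors on the original scale
par(mfrow=c(2,1))
data.test-exp(totalforecast$mean)
plot(data.test-exp(totalforecast$mean),ylab="error")
plot(abs((data.test-exp(totalforecast$mean))/data.test),ylab="absolute percentage error")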


Thank you!
Advantage and Disadvantage
The general form for the ARIMA model is:
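One common textbook way to write it uses the backshift operator B, where B y_t = y_{t-1}. The symbols below (φ, θ, Φ, ε) follow that convention and are an assumption, not notation taken from the slides; the constant term is omitted for simplicity.

```latex
\phi(B)\,(1 - B)^{d}\, y_t = \theta(B)\,\varepsilon_t,
\qquad
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p},
\qquad
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}
```

Under the same convention, the seasonal model used in this project, ARIMA(2, 1, 0) × (1, 0, 0) with period 12, would read:

```latex
(1 - \phi_1 B - \phi_2 B^{2})\,(1 - \Phi_1 B^{12})\,(1 - B)\, y_t = \varepsilon_t
```

Here y_t is the log-transformed series and ε_t is the random shock at time t.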

Three orders can be specified for an ARIMA model:
1. The Autoregressive Order is the order (p) of the autoregressive polynomial operator.
2. The Differencing Order is the order (d) of the differencing operator.
3. The Moving Average Order is the order (q) of the moving average polynomial operator.
An ARIMA model is commonly denoted ARIMA(p,d,q). If any of p, d, or q is zero, the corresponding letters are often dropped. For example, if p and d are zero, the model is denoted MA(q).