
Solution Code: 1GJB

Question: R studio



Case Scenario/Task

Bike-sharing systems are a new generation of traditional bike rentals in which the whole process of membership, rental and return has become automatic. Through these systems, a user can easily rent a bike at one location and return it at another. Currently, there are over 500 bike-sharing programs around the world, comprising over 500 thousand bicycles. Today, there is great interest in these systems because of their important role in traffic, environmental and health issues. Unlike other transport services such as bus or subway, these systems explicitly record the duration of travel and the departure and arrival positions. This feature turns a bike-sharing system into a virtual sensor network that can be used for sensing mobility in the city. It is expected that most important events in the city could be detected by monitoring these data.

In the final project, you will analyze the two-year historical log for the years 2011 and 2012 from the Capital Bikeshare system, Washington D.C., USA. The data set contains records of 17379 hourly counts of rentals. It was originally compiled by Fanaee and Gama in “Event labeling combining ensemble detectors and background knowledge” (2013). You will download the data set bikeshares.csv. The data set contains:

- instant: record index
- dteday: date
- season: season (1: spring, 2: summer, 3: fall, 4: winter)
- yr: year (0: 2011, 1: 2012)
- mnth: month (1 to 12)
- hr: hour (0 to 23)
- holiday: whether the day is a holiday or not (extracted from https://dchr.dc.gov/page/holidayschedule)
- weekday: day of the week
- workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0
- weathersit:
  - 1: Clear, Few clouds, Partly cloudy
  - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
  - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
  - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
- temp: normalized temperature in Celsius. The values are divided by 41 (max)
- atemp: normalized feeling temperature in Celsius. The values are divided by 50 (max)
- hum: normalized humidity. The values are divided by 100 (max)
- windspeed: normalized wind speed. The values are divided by 67 (max)
- casual: count of casual users
- registered: count of registered users
- cnt: count of total rental bikes, including both casual and registered

This analysis is intentionally open ended. While you explore the data, recall the tools you have learned in class.
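The scaling described in the variable list can be reversed to read the weather variables in their original units. The following is a minimal sketch in Python (used here only for illustration; the assignment itself is in R), assuming the simple divide-by-maximum scaling stated in the list. The helper name `denormalize` and the example record are hypothetical and not taken from the data set.

```python
# Maximum values quoted in the variable list: normalized = original / max.
SCALE_MAX = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}


def denormalize(row):
    """Return a copy of `row` with the normalized columns scaled back to original units."""
    out = dict(row)
    for name, max_value in SCALE_MAX.items():
        if name in out:
            out[name] = float(out[name]) * max_value
    return out


# Hypothetical record for illustration only (not a row of bikeshares.csv).
example = {"hr": 8, "temp": 0.5, "atemp": 0.5, "hum": 0.8, "windspeed": 0.0}
restored = denormalize(example)
print(restored)
```

Columns that are not listed in `SCALE_MAX` (here `hr`) pass through unchanged, so the same helper can be applied to a full record.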


Solution: R studio

ABSTRACT

Bike-sharing systems allow people to rent a bicycle at one of many automatic rental stations scattered around the city, use it for a short journey and return it at any other station in the city. The aim of the study was to investigate factors that affect bike rentals in Capital Bikeshare. Exploratory Data Analysis and Poisson/Quasi-Poisson models were used to analyze the data. The data were explored using frequency tables and graphical representations. To handle overdispersion in the data set, Quasi-Poisson and Negative Binomial models were fitted. Bike rentals are affected by weather conditions, time of day, month of the year, season, etc.

Key words: Poisson model, Quasi-Poisson model, Overdispersion

1 INTRODUCTION

Public bike sharing (PBS) systems are spreading across the globe and have been gaining popularity in transportation plans as a strategy to multiply travel choices, promote the use of active modes of transport, decrease dependence on the automobile and, especially, reduce greenhouse gas emissions (Contardo et al., 2012). Bike-sharing systems allow people to rent a bicycle at one of many automatic rental stations scattered around the city, use it for a short journey and return it at any other station in the city (Raviv et al., 2011). The Institute for Transportation & Development Policy (ITDP) has stated that more than 600 cities around the globe have their own bike-share systems, and more programs are starting every year. The largest systems are in China, in cities such as Hangzhou and Shanghai. In Paris, London, and Washington, D.C., highly successful systems have helped to promote cycling as a viable and valued transport option.

In this study, we analyzed the two-year historical log for the years 2011 and 2012 from the Capital Bikeshare system, Washington D.C., USA. Capital Bikeshare is the largest bike sharing program in the United States. The aim of the study was to investigate factors that affect bike rentals in Capital Bikeshare. An Exploratory Data Analysis (EDA) was done to gain insight into and a better understanding of the data set, using frequency tables and graphical representations.

1.1 Data description

The data set contains records of 17379 hourly counts of rentals. It was originally compiled by Fanaee and Gama in “Event labeling combining ensemble detectors and background knowledge” (2013). The variables used are described as follows:

- instant: record index
- dteday: date
- season: season (1: spring, 2: summer, 3: fall, 4: winter)
- yr: year (0: 2011, 1: 2012)
- mnth: month (1 to 12)
- hr: hour (0 to 23)
- holiday: whether the day is a holiday or not (extracted from https://dchr.dc.gov/page/holidayschedule)
- weekday: day of the week
- workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0
- hum: normalized humidity. The values are divided by 100 (max)
- windspeed: normalized wind speed. The values are divided by 67 (max)
- weathersit:
  - 1: Clear, Few clouds, Partly cloudy
  - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
  - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
  - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
- temp: normalized temperature in Celsius. The values are divided by 41 (max)
- atemp: normalized feeling temperature in Celsius. The values are divided by 50 (max)
- casual: count of casual users
- registered: count of registered users
- cnt: count of total rental bikes, including both casual and registered

The response variable used was the count of total rental bikes, including both casual and registered (cnt).

1.2 METHODOLOGY

A Poisson distribution was assumed in this study since the response variable is a count. The Poisson distribution is used for counts of events that occur randomly over time or space, when outcomes in disjoint periods or regions are independent. The mean and variance of a Poisson distribution are the same (Agresti, 2012). The Poisson distribution has a positive mean $\mu$, so it is more common to model the log of the mean, which can take any real value. A Poisson loglinear model with explanatory variables is given by

$$\mathrm{cnt}_i \sim \mathrm{Poisson}(\mu_i)$$

$$\log(\mu_i) = \mathrm{mnth}_{ik} + \mathrm{season}_{ij} + \mathrm{holiday}_i + \mathrm{hr}_{ir} + \mathrm{workingday}_i + \mathrm{hum}_i + \mathrm{windspeed}_i + \mathrm{weathersit}_{im} + \mathrm{temp}_i + \mathrm{atemp}_i$$

where $k = 1, \dots, 12$; $j = 1, \dots, 4$; $r = 0, \dots, 23$; $m = 1, \dots, 4$; and $i = 1, \dots, 17379$.

For count data, the Poisson assumption is often unrealistic because of overdispersion, i.e. the variance exceeds the mean (Agresti, 2012). Quasi-Poisson or Negative Binomial models can be used to handle overdispersion. The Quasi-Poisson model additionally estimates a scale parameter and corrects the estimated standard errors accordingly. It uses quasi-likelihood estimation, which assumes only a mean-variance relationship rather than a specific distribution for the response variable. The Negative Binomial model contains an extra dispersion parameter, which is the parameter of a multiplicative random effect, while still permitting $\mu$ to depend on explanatory variables (Agresti, 2012). All the models were fitted and compared using AIC. R software version 3.1.1 was used to analyze the data.

2 RESULTS AND DISCUSSION

2.1 Exploratory Data Analysis

The mean and variance of the response variable (cnt) were not the same: they were 189.4631 and 32901.46, respectively. The variance far exceeds the mean, which indicates overdispersion, i.e. greater variability in the counts than a Poisson model allows. To see how the response variable is distributed, a histogram of cnt was plotted. It shows a strongly positively skewed distribution, with the largest observed value equal to 977.
Figure 1: Response variable
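The comparison of mean and variance above can be reduced to a single number, the variance-to-mean ratio, which is close to 1 for Poisson data. The sketch below is in Python for illustration only (the study itself used R); it plugs in the summary statistics reported in the text, and the toy sample and the function name `dispersion_ratio` are hypothetical.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; roughly 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


# Summary statistics for cnt as reported in the text.
reported_mean = 189.4631
reported_variance = 32901.46
reported_ratio = reported_variance / reported_mean
print(f"variance / mean for cnt: {reported_ratio:.1f}")

# Hypothetical toy sample, only to show the helper on raw counts.
toy_counts = [1, 1, 2, 3, 50, 120, 400]
toy_ratio = dispersion_ratio(toy_counts)
print(f"variance / mean for toy sample: {toy_ratio:.1f}")
```

A ratio in the hundreds, as obtained from the reported values, is far from the value of 1 expected under a Poisson model, which is the informal evidence of overdispersion used in the text.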
From Table 1, it can be observed that there were more bike rentals on working days than on non-working days. There were many bike rentals during summer (season 2) and fall (season 3) and fewer during winter. The table also shows that under Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog (weather condition 4) only 3 bike rentals were made over the two years, so very few rentals were made in this weather condition. More generally, bike rentals decrease as weather conditions grow worse.

The data were also explored using scatter plots and box plots. Figure 2 shows scatter plots of the number of bike rentals against atemp, temp and wind speed, respectively. A random sample of cnt was selected in order to make these plots. It can be observed that many people tend to rent bikes when it is warmer and the wind is calm.

Figure 2: Weather condition variables

The exploration also implies that, on average, there were more bike rentals in 2012 than in 2011. The box plots in Figure 3 show the different times at which bikes were rented. On average, each day has a peak at around 8 a.m. and another in the evening at around 5 p.m. to 6 p.m. On average in each year, bike rentals decrease from December to February (winter season) and increase from March.

Figure 3: Time variables

2.2 Poisson regression

Dummy variables were created for all categorical explanatory variables. In the Poisson regression model, all variables included in the model were significant. With all other variables held constant, the average number of rented bikes in February (mnth 2) was about 1.089 (exp(0.0855)) times that in January (mnth 1, the reference level), an increase of about 9%. In addition, the average number of rented bikes at 8 a.m. was about 6.8093 (exp(1.9183)) times that at 12 midnight. The deviance of 744445.1 on 17378 degrees of freedom suggested that the Poisson regression model did not fit the data well.
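Because the model uses a log link, each coefficient is read by exponentiating it into a rate ratio. The following Python sketch (for illustration only; the study used R) reproduces that arithmetic from the coefficients and deviance quoted above. The function names are hypothetical.

```python
import math


def rate_ratio(coefficient):
    """Multiplicative change in the expected count implied by a log-link coefficient."""
    return math.exp(coefficient)


def percent_change(coefficient):
    """The same effect expressed as a percentage change relative to the reference level."""
    return (rate_ratio(coefficient) - 1.0) * 100.0


# Coefficients quoted in the text.
feb_vs_jan = rate_ratio(0.0855)            # February relative to January
eight_am_vs_midnight = rate_ratio(1.9183)  # 8 a.m. relative to 12 midnight

# Deviance divided by its degrees of freedom, using the values quoted in the text.
# A well-fitting Poisson model would give a value near 1.
deviance_per_df = 744445.1 / 17378

print(f"February vs January: x{feb_vs_jan:.3f} ({percent_change(0.0855):.1f}% change)")
print(f"8 a.m. vs midnight:  x{eight_am_vs_midnight:.3f}")
print(f"deviance / df:       {deviance_per_df:.1f}")
```

A coefficient of 0 corresponds to a rate ratio of exactly 1, i.e. no difference from the reference level, which is a convenient sanity check when reading a coefficient table.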
This was in line with the overdispersion observed in the EDA. To handle the overdispersion, Quasi-Poisson and Negative Binomial models were fitted. Table 2 gives the parameter estimates of the Poisson/Quasi-Poisson regression and Negative Binomial models. The standard errors of the Quasi-Poisson regression model are larger than those of the Poisson regression model. The dispersion parameter of the Quasi-Poisson regression model was 43.4224, which confirms that there is overdispersion in the data set and that the Poisson model underestimated the standard errors. The AIC of the Negative Binomial model (191 283) was smaller than that of the Poisson regression model (7 855 440). The deviance of the Negative Binomial model was also much smaller than that of the Poisson regression model. This implies that the Negative Binomial model improves on the Poisson model and deals with overdispersion quite well. The estimated dispersion parameter of the Negative Binomial model was 3.1038, indicating that the Negative Binomial model is more appropriate than the Poisson regression model for this data set.

Table 2: Estimated coefficients of the Poisson/Quasi-Poisson regression and Negative Binomial models

3 CONCLUSION and RECOMMENDATION

The study aimed to find out whether there are factors that affect bike rental patterns in the Capital Bikeshare system. From the EDA it was found that season, year, month, hour, holiday, working day and weather conditions affect bike rental patterns. There was overdispersion in the data set. This was observed both in the EDA and when fitting the Poisson regression model. To handle the overdispersion, Quasi-Poisson and Negative Binomial models were fitted. It was found that the Negative Binomial model improved on the Poisson regression model. The standard errors in the Quasi-Poisson model were larger than in the Poisson regression model. This implies that there was overdispersion in the data set and that the Poisson model underestimated the standard errors.
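The link between the dispersion estimate and the larger standard errors can be made concrete: a quasi-Poisson fit keeps the Poisson coefficient estimates and multiplies each Poisson standard error by the square root of the estimated dispersion. The Python sketch below (illustration only; the study used R) applies this to the reported dispersion of 43.4224. The Poisson standard error of 0.01 is a hypothetical value, and the negative binomial variance function assumes the parameterisation Var = mu + mu^2 / theta, which the text does not state explicitly.

```python
import math


def quasi_poisson_se(poisson_se, dispersion):
    """Quasi-Poisson standard error: the Poisson one scaled by sqrt(dispersion)."""
    return poisson_se * math.sqrt(dispersion)


def negative_binomial_variance(mu, theta):
    """Variance under the assumed parameterisation Var = mu + mu^2 / theta."""
    return mu + mu ** 2 / theta


# Dispersion estimate reported in the text for the Quasi-Poisson model.
dispersion = 43.4224
inflation = math.sqrt(dispersion)
print(f"standard errors are inflated by a factor of about {inflation:.2f}")

# Hypothetical Poisson standard error, for illustration only.
print(f"SE 0.01 becomes {quasi_poisson_se(0.01, dispersion):.4f}")

# Implied variance at the reported mean of cnt, assuming the reported 3.1038 is theta.
implied_variance = negative_binomial_variance(189.4631, 3.1038)
print(f"implied negative binomial variance at the mean: {implied_variance:.0f}")
```

Under these assumptions every Poisson standard error grows by the same factor, which is why the coefficient estimates in a Poisson and a Quasi-Poisson fit coincide while the test statistics shrink.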
All variables were statistically significant, implying that they can be used to predict bike rental patterns. In this study, it was also found that there was heterogeneity among the variables used. Other methods that account for heterogeneity can be explored; this was not done here because of limited time. For further studies, interactions between explanatory variables can be included in the analyses.
