University | Singapore University of Social Sciences (SUSS) |
Subject | ST2195 Programming For Data Science Report |
Part 2
The 2009 ASA Statistical Computing and Graphics Data Expo consisted of flight arrival and departure details for all commercial flights on major carriers within the USA from October 1987 to April 2008. This is a large dataset; there are nearly 120 million records in total, and it takes up 1.6 gigabytes of space when compressed and 12 gigabytes when uncompressed.
The complete dataset, along with supplementary information and variable descriptions, can be downloaded from the Harvard Dataverse at https://doi.org/10.7910/DVN/HG7NV7. Choose any subset of ten consecutive years and any of the supplementary information provided by the Harvard Dataverse to answer the following questions using the principles and tools you have learned in this course:
(a) What are the best times and days of the week to minimise delays each year?
(b) Evaluate whether older planes suffer more delays on a year-to-year basis.
(c) For each year, fit a logistic regression model for the probability of diverted US flights using as many features as possible from attributes of the departure date, the scheduled departure and arrival times, the coordinates and distance between departure and planned arrival airports, and the carrier. Visualize the coefficients across years.
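As a rough illustration of how question (a) might be approached, the sketch below averages arrival delay per year, day of week, and scheduled departure hour, then picks the slot with the lowest mean. It is only a minimal, dependency-free outline: the column names (Year, DayOfWeek, CRSDepTime, ArrDelay, Cancelled) are assumed to match the Data Expo variable descriptions and should be checked against the downloaded files, the sample rows are invented, and the choice of ArrDelay as the delay measure is one modelling decision among several.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented sample rows in the assumed Data Expo column layout (illustrative only).
SAMPLE_CSV = """Year,DayOfWeek,CRSDepTime,ArrDelay,Cancelled
2005,1,0600,-5,0
2005,1,0630,3,0
2005,1,1745,40,0
2005,5,1810,62,0
2005,5,0615,NA,1
2005,6,0900,10,0
"""


def mean_delay_by_slot(rows):
    """Return mean ArrDelay keyed by (year, day of week, scheduled departure hour)."""
    delays = defaultdict(list)
    for row in rows:
        # Skip cancelled flights and rows whose delay is missing ("NA" or empty).
        if row["Cancelled"] == "1" or row["ArrDelay"] in ("", "NA"):
            continue
        hour = int(row["CRSDepTime"]) // 100  # hhmm -> hour of day
        key = (int(row["Year"]), int(row["DayOfWeek"]), hour)
        delays[key].append(float(row["ArrDelay"]))
    return {key: mean(values) for key, values in delays.items()}


rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
averages = mean_delay_by_slot(rows)

# The "best" slot is taken here to be the one with the lowest mean arrival delay.
best_slot = min(averages, key=averages.get)
print(best_slot, averages[best_slot])
```

On the full dataset the same grouping would normally be done per year with a data-frame library or a database query rather than a Python loop, and slots with very few flights would need separate handling before being ranked.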
General Instructions
• All questions should be answered using both R and Python.
• Your answers should be provided in a separate structured report of no more than 1 page for part 1 and 6 pages for part 2. The page limit excludes titles, references, and table of contents but includes graphics and tables. The report should be in PDF format and also contain adequate explanations for readers not familiar with programming. In addition to the report, you will also be asked to provide your R and Python code in RMarkdown and Jupyter notebooks, respectively. All the relevant files must be submitted in the designated Atrio or VLE submission portal.
• For part 2, each report should detail all steps you took starting from raw data up to the answer for each question. Any databases you set up, data wrangling/cleaning operations you carry out, and any modeling decisions you make should be clearly described in each structured report. Each report should also include any relevant graphics and tables as part of the answer.
• If you are using elements (e.g. code, databases, or graphics) from your answer to a previous question to answer the current one, you will need to refer to those elements.
• You should also supply the code you used to answer each question, in a way that can be used by someone else to replicate your analyses. You can do this either as separate scripts or separate RMarkdown/Jupyter notebooks per question, clearly indicating (both with comments and in the filename) which question each script refers to.
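The instructions above mention databases and replicable code. One possible way to meet both, sketched here under stated assumptions, is to load the flight records and the supplementary plane data into SQLite and keep each question's query in its own script. The table names (ontime, planes), the in-memory database, and the sample rows are hypothetical; the columns TailNum, ArrDelay, tailnum, and year (year of manufacture) are assumed to match the downloaded files. The query groups mean arrival delay by flight year and plane age, which is one way to start on question (b).

```python
import sqlite3

# In-memory database for illustration; a real run would write to a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ontime (Year INTEGER, DayOfWeek INTEGER, TailNum TEXT, ArrDelay REAL)"
)
conn.execute("CREATE TABLE planes (tailnum TEXT PRIMARY KEY, year INTEGER)")

# Invented rows standing in for the yearly flight files and the plane data file.
conn.executemany(
    "INSERT INTO ontime VALUES (?, ?, ?, ?)",
    [
        (2005, 1, "N101", 10.0),
        (2005, 2, "N101", 20.0),
        (2005, 3, "N202", 5.0),
        (2006, 1, "N202", None),  # missing delay, excluded by the query
        (2006, 4, "N101", 30.0),
    ],
)
conn.executemany(
    "INSERT INTO planes VALUES (?, ?)",
    [("N101", 1990), ("N202", 2000)],
)

# Mean arrival delay by flight year and plane age (flight year minus manufacture year).
query = """
SELECT o.Year            AS flight_year,
       o.Year - p.year   AS plane_age,
       AVG(o.ArrDelay)   AS mean_arr_delay,
       COUNT(*)          AS flights
FROM ontime AS o
JOIN planes AS p ON p.tailnum = o.TailNum
WHERE o.ArrDelay IS NOT NULL
GROUP BY o.Year, o.Year - p.year
ORDER BY o.Year, o.Year - p.year
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
conn.close()
```

Keeping the schema, the load step, and the query in one script per question makes it easier for someone else to rerun the analysis from the raw files, and the same SQL can be issued from R so that both language versions share one database.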