final project


Graded Assignment:  Final Project

You work for a hypothetical university as an entry-level data analyst, and your supervisor has tasked you with holistically applying the theory and techniques from tasks assigned in previous weeks, along with any new content from the current week, to support the data mining process, including the following:

  • Problem Definitions
  • Data Explorations
  • Data Preparations
  • Modeling
  • Evaluation
  • Deployment

The overall final project paper will contain a minimum of ten pages of written content, not including illustrations, supported by a minimum of five academic sources of research.

Problem Definitions

After introducing the paper, think about and discuss some potential project objectives and requirements that would fit into your existing work environment or a future work environment based on your chosen field of study.  After some thought about these objectives and requirements, transform them into a data mining problem definition. Note that the objectives and requirements leading to the data mining problem definition can be fictional or hypothetical and can be fitted around some of the sample data sets in Rapid Miner Studio or other data sets of your choice, which can be uploaded to Rapid Miner Studio.

Data Explorations

After developing one or more problem definitions, explore the existing sample data sets in Rapid Miner Studio or upload any data sets of your choice into Rapid Miner Studio. When using sample data sets in Rapid Miner Studio, be sure to use data sets not used in other written assignments in this course.  Many other data sets can be located at:

Center for Machine Learning and Intelligent Systems, (2018). Machine learning repository. UCI. Retrieved from https://archive.ics.uci.edu/ml/datasets/Abalone

Most of the files at the machine learning repository are comma-delimited and can be uploaded to Rapid Miner Studio using the “Add Data” feature once they are saved to a computer.

Once one or more data sets are selected, discuss in detail why you chose them, whether their quality is good or bad, and whether any data cleansing is needed.
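As a rough illustration of what a quality check looks for, the sketch below counts duplicate rows and empty cells in a comma-delimited file. It is written in Python rather than Rapid Miner Studio, and the column names and values in SAMPLE are hypothetical stand-ins, not taken from any repository file.

```python
import csv
import io

# Hypothetical comma-delimited sample (assumption, not real repository data).
# A saved file would instead be opened with open(path, newline="").
SAMPLE = """sex,length,rings
M,0.455,15
F,,9
M,0.455,15
I,0.330,7
"""

def quality_report(text):
    """Count rows, exact duplicate rows, and empty cells per column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    seen = set()
    duplicates = 0
    missing = {name: 0 for name in rows[0]} if rows else {}
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            duplicates += 1  # same values as an earlier row
        seen.add(key)
        for name, value in row.items():
            if value is None or value.strip() == "":
                missing[name] += 1  # empty cell in this column
    return {"rows": len(rows), "duplicates": duplicates, "missing": missing}

report = quality_report(SAMPLE)
print(report)
```

A report like this gives concrete numbers to cite when arguing that a data set is in good or bad quality.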

Data Preparations

After selecting and evaluating data sets, describe any actions and methods taken in Rapid Miner Studio to collect, cleanse, and format the data for use.  Although models are not used in this phase, consider using some of the basic statistics and charts to verify and demonstrate the data cleansing or preparation process.  Remember that visualizations are like pictures worth thousands of words: they offer ample opportunity for written content and analysis, including any initial decisions based on what is seen in the data and in the initial charts.
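The kind of basic statistics mentioned above can be sketched in a few lines. This is a minimal Python example, separate from Rapid Miner Studio, that drops missing values from one column and summarizes what remains; the raw_lengths values are hypothetical, and dropping missing values is only one possible cleansing choice.

```python
import statistics

# Hypothetical raw column (assumption); None marks a missing measurement.
raw_lengths = [0.455, None, 0.350, 0.530, 0.440, None, 0.330]

def summarize(values):
    """Return count, mean, median, and sample standard deviation."""
    return {
        "count": len(values),
        "mean": round(statistics.mean(values), 3),
        "median": statistics.median(values),
        "stdev": round(statistics.stdev(values), 3),
    }

# Cleansing step: keep only the rows that have a measurement.
cleaned = [v for v in raw_lengths if v is not None]
summary = summarize(cleaned)
print(f"removed {len(raw_lengths) - len(cleaned)} missing values")
print(summary)
```

Reporting the same summary before and after a cleansing step is one simple way to show in the paper what the step changed.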

Modeling

After the data is prepared, apply the modeling and analysis techniques learned in this class using decision trees, association analysis, cluster analysis, and anomaly/outlier detection.  Trial-and-error exploration of other modeling or process options in Rapid Miner Studio would be icing on the cake.  For example, use the “File” menu and then “New Process” to see preset processes for ideas, and test them to see what happens with the data sets chosen for this project.  This section of the paper will offer many opportunities to create data visualizations.

Important Note:  With limited time and exposure in this course, I would not expect complete mastery of modeling, but Rapid Miner Studio has many help features, and the goal is to get some output and data visualizations to support the development of decision-making skills. In other words, do not get frustrated or hung up if the models created are not perfect.
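To make the idea of anomaly/outlier detection concrete, here is a minimal Python sketch of one simple approach, flagging values by z-score. It is not a description of how any Rapid Miner Studio operator works; the function name, the threshold of 2.0, and the scores are all hypothetical choices for illustration.

```python
import statistics

def zscore_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical exam scores (assumption) with one unusual entry.
scores = [72, 75, 71, 74, 73, 76, 70, 140]
outliers = zscore_outliers(scores)
print(outliers)
```

Whether a flagged value is an error to cleanse or a genuine finding to discuss is a judgment the paper should explain.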

Evaluation

After applying various modeling and process techniques to the chosen data, think back to the problem definitions established at the start of this project and discuss in detail whether the models and process techniques solve these problems and satisfy the business objectives, which again can be fictional or hypothetical.  In this section of the paper, look closely at the data visualizations and discuss what types of decisions can be made based on the output of the various models and processes used.  In other words, think of these data visualizations and outputs as data transformed into business intelligence that helps in the decision-making process.

Deployment

After the evaluation is complete, think again about your existing or future work environment and how you would deploy this business intelligence.  In other words, what formats or reports could be used, and which stakeholders or personnel would see and use this analysis?

Conclusions

At this final stage of the paper, home base is very close. Conclude the paper with an overall reflection on your learning in this course and how this project could potentially benefit similar tasks you may perform in an existing or future job.

Important Reminder:  In support of this final project, review, if needed, any of the video tutorials at https://rapidminer.com/training/videos/.  Additional learning videos could be found at www.youtube.com using keyword searches with Rapid Miner Studio and associated modeling and process techniques included in the keyword searches.

Once the overall final project is completed, remember that it is to be professionally formatted using APA, including an APA cover page, abstract, body pages, and reference page.

Complete and submit this assignment for grading on or before the due date.  Remember, it is not a good idea to complete, or attempt to complete, work late.  See the course syllabus and the associated late policy.

Assessment Criteria | Possible Points | Points Earned
Student included a front APA cover page (Page 1) | 5 |
Student included an abstract (Page 2) | 5 |
Based on a data set, problem definitions were assessed and discussed. | 10 |
Based on data set, the data was explored and the data was prepared for analysis. | 10 |
Data was modeled and analyzed using RapidMiner Studio | 10 |
Data visualizations are included along with development of decisions based on these visualized outputs of data analysis techniques used. | 30 |
In the conclusion, ideas are presented on how to deploy data analysis techniques used as business intelligence. | 10 |
Student included in-text citations with a complete reference page properly formatted using APA | 10 |
Student included a completed paper free of grammar and spelling issues. | 10 |
Total Earned points | 100 |
