Data Mining Assignment Help
School of Physics, Engineering and Computer Science
Assignment Briefing Sheet (2020/21 Academic Year)
Section A: Assignment title, important dates and weighting
Assignment title: Practical
Group or individual: Individual
Module title: Data Mining
Module code: 7COM1018
Module leader: Peter Lane
Moderator’s initials: VS
Submission deadline: 26th April 2021 23:59
Target date for return of marked assignment: 24th May 2021
You are expected to spend about 40 hours to complete this assignment to a satisfactory standard.
This assignment is worth 40% of the overall assessment for this module.
Section B: Student(s) to complete
Student ID number Year Code
NOT NEEDED FOR ONLINE SUBMISSION
Notes for students
For undergraduate modules, a score above 40% represents a pass performance at honours level. For postgraduate modules, a score of 50% or above represents a pass mark.
For each day or part thereof (or, for hard copy submission only, each working day or part thereof) that an item of coursework is late, up to five days after the published deadline, coursework relating to modules at Levels 0, 4, 5 and 6 (including deferred coursework, but with the exception of referred coursework) will have the numeric grade reduced by 10 grade points until or unless the numeric grade reaches or is 40. Where the numeric grade awarded for the assessment is less than 40, no lateness penalty will be applied.
Late submission of referred coursework will automatically be awarded a grade of zero (0). Coursework (including deferred coursework) submitted later than five days (five working days in the case of hard copy submission) after the published deadline will be awarded a grade of zero (0).
Regulations governing assessment offences including Plagiarism and Collusion are available from https://www.herts.ac.uk/about-us/governance/university-policies-and-regulations-uprs/uprs (please refer to UPR AS14).
Guidance on avoiding plagiarism can be found here: https://herts.instructure.com/courses/61421/pages/referencing-avoiding-plagiarism?module_item_id=779436
Modules may have several components of assessment and may require a pass in all elements. For further details, please consult the relevant Module Handbook (available on Studynet/Canvas, under Module Information) or ask the Module Leader.
This Assignment assesses the following module Learning Outcomes (from Definitive Module Document):
Successful students will typically:
2. be able to appreciate the strengths and limitations of various data mining models;
3. be able to critically evaluate, articulate and utilise a range of techniques for designing data mining systems;
5. be able to critically evaluate different algorithms and models of data mining.
Assignment Brief:
A dataset of text is provided in the assignment area on Canvas. Analyse this data using the WEKA toolkit and the tools introduced within this module, comparing two different forms of preprocessing. For example, you may investigate the impact of using stemming, the effect of reducing the number of features, the impact of using term frequency instead of a simple word count, etc.
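As an illustration of one of the comparisons mentioned above, the following minimal sketch contrasts a simple word count with a term-frequency weighting. It is not WEKA code: the tokenizer, the function names and the example sentence are all hypothetical, and the weighting log(1 + count) is only one common definition of term frequency, assumed here for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lower-case alphabetic tokens only; a real tokenizer may differ.
    return re.findall(r"[a-z]+", text.lower())


def word_counts(text):
    # Simple word count: how many times each token occurs in the document.
    return Counter(tokenize(text))


def log_term_frequencies(text):
    # Assumed term-frequency weighting: log(1 + count), which dampens
    # the influence of words that are repeated many times.
    return {word: math.log(1 + count) for word, count in word_counts(text).items()}


# Hypothetical example document.
doc = "Data mining finds patterns in data; mining text is one case."
counts = word_counts(doc)
tf = log_term_frequencies(doc)
```

With the raw count, a word occurring twice weighs twice as much as a word occurring once; with the log weighting the ratio is log(3)/log(2), roughly 1.58. That difference is the kind of effect the comparison would measure.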
Complete the following tasks:
1. Describe which question you will be investigating (e.g. “is stemming beneficial to improving performance?”, “is the reduction of features beneficial to improving performance?”, etc.) and why you think your choice is an interesting question to investigate.
2. Convert the text dataset into TWO different databases in ARFF format, based on your chosen question. Explain the conversion techniques and parameters that you have used, along with any other pre-processing you wish to do. (Do not include a screenshot of the attributes in WEKA – you need to describe them.)
3. For each database, produce a table and a graph of classification performance against training set size for the following three classifiers: decision tree (J48), Naïve Bayes, Support Vector Machine. For the Support Vector Machine you must determine the kernel and its parameters.
4. Write a conclusion. You should at least compare the performance of the different learning algorithms on your databases, and answer the question you posed in part (1).
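To make the ARFF conversion in task 2 more concrete, here is a minimal sketch of how two databases could be written from the same text: one with every word as a feature and one with a reduced feature set. It is an illustration under stated assumptions, not the conversion WEKA performs: the function `to_arff`, the `word_` attribute prefix, the `max_features` parameter and the toy documents are all hypothetical, and class labels are assumed to be single tokens without spaces or commas.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lower-case alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())


def to_arff(relation, docs, labels, max_features=None):
    """Build an ARFF string with one numeric word-count attribute per word."""
    counts = [Counter(tokenize(d)) for d in docs]
    totals = Counter()
    for c in counts:
        totals.update(c)
    # Most frequent words first; ties broken alphabetically for determinism.
    vocab = sorted(totals, key=lambda w: (-totals[w], w))
    if max_features is not None:
        vocab = vocab[:max_features]  # keep only the top-N words
    classes = sorted(set(labels))
    lines = [f"@relation {relation}", ""]
    # The word_ prefix avoids a clash between a word and the class attribute.
    lines += [f"@attribute word_{w} numeric" for w in vocab]
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    for c, label in zip(counts, labels):
        lines.append(",".join([str(c[w]) for w in vocab] + [label]))
    return "\n".join(lines) + "\n"


# Hypothetical toy dataset, not the one provided on Canvas.
docs = ["cheap pills cheap offer", "meeting agenda attached", "cheap offer today"]
labels = ["spam", "ham", "spam"]
full = to_arff("text_full", docs, labels)
reduced = to_arff("text_top2", docs, labels, max_features=2)
```

Writing `full` and `reduced` to two `.arff` files would give a pair of databases that differ only in the number of features, which matches the "reduction of features" question.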
Remember to explain the steps you have taken to complete each task in your report. Screenshots are typically not required, and should be used sparingly if at all.
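The table of classification performance against training set size in task 3 can be sketched as a small evaluation loop: train on progressively larger prefixes of the training data and score each model on the same held-out test set. The sketch below is only an outline of that procedure. The majority-class learner is a deliberately trivial stand-in, not J48, Naïve Bayes or a Support Vector Machine, and all names and data are hypothetical.

```python
from collections import Counter


def train_majority(train):
    # Stand-in learner (assumption): always predicts the most common training label.
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority


def accuracy(model, test):
    # Fraction of test instances whose predicted label matches the true label.
    correct = sum(1 for features, label in test if model(features) == label)
    return correct / len(test)


def learning_curve(train_fn, train, test, sizes):
    # Train on the first n instances for each n, always testing on the same set.
    rows = []
    for n in sizes:
        model = train_fn(train[:n])
        rows.append((n, accuracy(model, test)))
    return rows


def format_table(rows):
    lines = ["train_size  accuracy"]
    lines += [f"{n:>10}  {acc:8.3f}" for n, acc in rows]
    return "\n".join(lines)


# Hypothetical toy data: (features, label) pairs.
train = [((1,), "spam"), ((0,), "ham"), ((0,), "ham"), ((1,), "ham")]
test = [((1,), "spam"), ((0,), "ham"), ((0,), "ham"), ((1,), "ham")]
rows = learning_curve(train_majority, train, test, sizes=[1, 3, 4])
print(format_table(rows))
```

In practice the training data would be shuffled before taking prefixes, and the same loop would be repeated for each classifier and each database so that the resulting rows can be tabulated and plotted side by side.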
Submission Requirements:
A single PDF document containing your report, to a maximum of 10 pages.
Marks awarded for:
Marks will be awarded out of 100 in the proportion:
1. Question (5 marks)
2. Conversion (40 marks)
3. Training/testing (40 marks)
4. Conclusion (15 marks)
A reminder that all work should be your own. Reports exceeding the maximum length may not be marked beyond the 10 pages.
Type of Feedback to be given for this assignment:
Along with the marks, each student will receive individual written feedback on the online platform.