IT-Related Discussion


Task 1: 200-250 words with references

Why is outlier mining important? Briefly describe the approaches behind statistical-based outlier detection, distance-based outlier detection, density-based local outlier detection, and deviation-based outlier detection.
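To make the contrast between two of these approaches concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a reference answer: the function names, the z-score threshold, the radius, the neighbor count, and the sample data are all hypothetical choices made for this example. The statistical check assumes the values roughly follow a single distribution and flags points far from the mean; the distance-based check makes no distributional assumption and flags points with too few neighbors nearby.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Statistical-based check: flag values whose z-score exceeds the threshold.

    The threshold is an assumed, illustrative cutoff.
    """
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


def distance_outliers(values, radius, min_neighbors):
    """Distance-based check: flag values with fewer than min_neighbors
    other values within the given radius."""
    outliers = []
    for i, v in enumerate(values):
        neighbors = sum(
            1 for j, w in enumerate(values) if j != i and abs(v - w) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(v)
    return outliers


# Hypothetical sample: seven similar readings and one extreme value.
data = [10, 11, 9, 10, 12, 11, 10, 50]
statistical_result = zscore_outliers(data)
distance_result = distance_outliers(data, radius=2, min_neighbors=2)
```

On this sample both checks single out the extreme value. Density-based and deviation-based detection are not covered by this sketch.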

Task 2: 200-250 words with references

A group of students is linked to one another in a social network through advisors, courses, research groups, and friendships. Present a data mining clustering method that could partition the students into groups according to their research interests.

Task 3: 200-250 words with references

What are the differences between visual data mining and data visualization? Data visualization may suffer from the data abundance problem. For example, it is not easy to visually discover interesting properties of network connections if a social network is huge with complex and dense connections. Propose a visualization method that may help people see through the network topology to the interesting features of a social network.

Task 4: 200-250 words with references

An e-mail database is a database that stores a large number of electronic mail (e-mail) messages. It can be viewed as a semi-structured database consisting mainly of text data. Discuss the following:

a. What can be mined from such an e-mail database?

b. Suppose you have roughly classified a set of your previous e-mail messages as junk, unimportant, normal, or important. What type of data mining problem is this? Describe how a data mining system could use these messages as a training set to automatically classify new or previously unclassified e-mail messages.
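As one possible illustration of part b, the sketch below trains a small multinomial naive Bayes classifier on labeled messages and then labels unseen ones. Naive Bayes is only one of several techniques that could be used here, and everything specific in the sketch is an assumption made for the example: the function names, the whitespace tokenization, the add-one smoothing, and the tiny made-up training set.

```python
import math
from collections import Counter, defaultdict


def train(labeled_messages):
    """Count classes and per-class word frequencies from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_messages:
        class_counts[label] += 1
        for word in text.lower().split():  # naive whitespace tokenization
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                # add-one (Laplace) smoothing avoids zero probabilities
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set using the four labels from the task.
training_set = [
    ("win free prize now", "junk"),
    ("free offer click now", "junk"),
    ("weekly newsletter digest", "unimportant"),
    ("lunch menu this week", "normal"),
    ("project deadline moved to friday", "important"),
    ("urgent meeting about project budget", "important"),
]
model = train(training_set)
first_label = classify("free prize offer", model)
second_label = classify("project budget meeting", model)
```

With this toy training set, the first message is labeled junk and the second important. A realistic system would need far more training data, better tokenization, and use of header fields such as sender and subject.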

Task 5: 200-250 words with references

Suppose that your local bank has a data mining system. The bank has been studying your credit and debit card usage patterns. Noticing that you make many transactions at home renovation stores, the bank decides to contact you with information about its special loans for home improvements. Discuss how this may conflict with your right to privacy.