MIT401 – Data Warehousing and Data Mining

Dear students get fully solved assignments

Send your semester & Specialization name to our mail id :

help.mbaassignments@gmail.com

or

call us at : 08263069601

 

 

ASSIGNMENT

 

PROGRAM: Master of Science in Information Technology (MSc IT) Revised Fall 2011
SEMESTER: 4
SUBJECT CODE & NAME: MIT401 – Data Warehousing and Data Mining
CREDIT: 4
BK ID: B1633
MAX. MARKS: 60

 

Note: Answer all questions. Answers to 10-mark questions should be approximately 400 words. Each question is followed by an evaluation scheme.

 

 

 

1 Explain the Top-Down and Bottom-Up Data Warehouse development Methodologies.

Answer: Data warehouse systems have gained popularity as companies across many industries realize how useful these systems can be. Many of these organizations, however, lack the experience and skills required to meet the challenges of data warehousing projects. In particular, the lack of a methodological approach prevents data warehousing projects from being carried out successfully. Generally, methodological approaches are created by closely studying similar experiences and minimizing the

 

 

 

 

2 Explain the Functionalities and advantages of Data Warehouses.

Answer: A common way of introducing data warehousing is to refer to the characteristics of a data warehouse.

  • Subject Oriented
  • Integrated
  • Nonvolatile
  • Time Variant

 

Subject Oriented: Data warehouses are designed to help you analyze data. For example, to learn more about your company’s sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions

 

 

 

3 Describe Hypercube and Multicube.

 

Answer: Multidimensional databases can present their data to an application using two types of cubes: hypercubes and multicubes. In the hypercube model, as shown in the following illustration, all data appears logically as a single cube. All parts of the manifold represented by this hypercube have identical dimensionality.

 

In the multicube model, data is segmented into a set of smaller cubes, each of which is composed of a subset of the available dimensions, as shown in the following illustration:

Hypercubes and multicubes differ in terms of available metadata. In a hypercube, each dimension belongs to one cube only. A dimension is “owned” by the hypercube. In a multicube, a dimension can be part of multiple cubes. That is, dimensions are
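The difference in dimension ownership can be illustrated with a small sketch. The `Cube` class, the dimension names (`product`, `region`, `month`), and the measure values below are all hypothetical and chosen only for illustration; real multidimensional databases expose this through their own APIs.

```python
class Cube:
    """A toy cube: cells are addressed by one value per dimension (hypothetical helper)."""

    def __init__(self, name, dimensions):
        self.name = name
        self.dimensions = tuple(dimensions)
        self.cells = {}

    def _key(self, coords):
        # Every cell must be addressed by exactly this cube's dimensions.
        if set(coords) != set(self.dimensions):
            raise KeyError(f"{self.name} expects dimensions {self.dimensions}")
        return tuple(coords[d] for d in self.dimensions)

    def put(self, value, **coords):
        self.cells[self._key(coords)] = value

    def get(self, **coords):
        return self.cells.get(self._key(coords))


def cubes_using(dimension, cubes):
    """Return the names of the cubes that contain the given dimension."""
    return [c.name for c in cubes if dimension in c.dimensions]


# Hypercube model: a single logical cube; every cell uses all dimensions.
hypercube = Cube("all_data", ["product", "region", "month"])
hypercube.put(120, product="laptop", region="north", month="Jan")

# Multicube model: smaller cubes, each built on a subset of the dimensions.
sales = Cube("sales", ["product", "region", "month"])
budget = Cube("budget", ["region", "month"])
sales.put(120, product="laptop", region="north", month="Jan")
budget.put(500, region="north", month="Jan")
multicube = [sales, budget]

print(cubes_using("region", [hypercube]))  # the dimension belongs to one cube
print(cubes_using("region", multicube))    # the dimension is shared by two cubes
```

In the sketch, `region` and `month` appear in both `sales` and `budget`, while in the hypercube every dimension is part of the one cube only.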

 

 

 

 

4 List and explain the Strategies for data reduction.

 

Answer: Data reduction is the process of minimizing the amount of data that needs to be stored in a data storage environment. Data reduction can increase storage efficiency and reduce costs.

 

Strategies for data reduction:

TAKE ADVANTAGE OF EXISTING INFORMATION: First of all, we don’t want to reinvent the wheel. There’s a lot of existing information out there for community health coalitions to take advantage of. Know your community’s history! Has this

 

 

 

 

 

 

 

5 Describe the K-means method for clustering. List its advantages and drawbacks.

 

Answer: k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

 

The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm
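The iterative heuristic described above can be sketched in a few lines. This is a minimal illustration of the standard assign-then-update loop (often called Lloyd's algorithm), not a production implementation; the function names, the random initialization from the input points, and the sample data are assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Partition points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: a local optimum was reached
        labels = new_labels
        # Update step: each centroid moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster became empty
                centroids[j] = mean(members)
    return centroids, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Because the loop only reaches a local optimum, the result can depend on the initial centroids; running it with several values of `seed` and keeping the best partition is a common workaround.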

 

 

 

6 Describe Multilevel Databases and Web Query Systems.

 

Answer: Multilevel Databases: The main idea behind this approach is that the lowest level of the database contains semi-structured information stored in various Web repositories, such as hypertext documents. At the higher level(s), metadata or generalizations are extracted from the lower levels and organized in structured collections, i.e. relational or object-oriented databases. For example, Han et al. use a multilayered database where each layer is obtained via generalization and transformation operations performed on the lower layers. Khosla et al. propose the creation and maintenance of meta-databases at each information-providing domain

 
