For this example, we do not need to set any. The identify function is a convenient method for marking points in a scatter plot. In R programming, a box plot can be treated as a type of scatter plot for this purpose.

Example

In this example, we generate 100 random numbers and then plot the points in a box plot. Then, we mark the first outlier with its identifier as follows:

> y <- rnorm(100)
> boxplot(y)
> identify(rep(1, length(y)), y, labels = seq_along(y))

A single box is drawn at x = 1, so rep(1, length(y)) supplies the x coordinates of the points; identify then waits for you to click near a point and labels it with its index. Notice the 0 next to the outlier in the graph.

Example

The boxplot function automatically computes the outliers for a set as well.
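As a minimal sketch of how those computed outliers can be read back, the following code calls boxplot with plot = FALSE and inspects the out component of the result. The seed, the two planted extreme values, and the variable names b and outlier_index are assumptions made for illustration only.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
y <- c(rnorm(100), 6, -7)    # 100 normal draws plus two planted extreme values

# plot = FALSE returns the box plot statistics without drawing anything
b <- boxplot(y, plot = FALSE)

b$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out     # values that fall beyond the whiskers (the outliers)

# Map the outlier values back to their positions in y; these positions are
# the labels that identify() shows when labels = seq_along(y)
outlier_index <- which(y %in% b$out)
outlier_index
```

The same values are available from boxplot.stats(y)$out if you do not need the rest of the box plot object.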

The default value is List.

This contains the matrix.

initialization: This contains NULL or a list of one or more of the following components:
• hcPairs: This is used to merge pairs
• subset: This is to be used during initialization
• noise: This makes an initial guess at noise

warn: This contains which warnings are to be issued. Default is none.

List of model names

The Mclust function uses a model when trying to decide which items belong to a cluster. There are different model names for univariate, multivariate, and single component datasets.
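The following is a minimal sketch of how these arguments might be passed, assuming the mclust package is installed. The built-in faithful dataset, the seed, the choice of the "EII" and "VVV" models, and the init_rows subset are all illustrative assumptions, not settings taken from the text.

```r
library(mclust)   # assumes the mclust package is installed

# Model names differ by data shape: "E" and "V" apply to univariate data,
# while names such as "EII" and "VVV" apply to multivariate data
mclustModelNames("E")
mclustModelNames("VVV")

set.seed(1)                                # arbitrary seed for the subset draw
init_rows <- sample(nrow(faithful), 100)   # hypothetical subset for initialization

fit <- Mclust(
  faithful,                                # built-in two-column dataset
  G = 1:3,                                 # candidate numbers of clusters
  modelNames = c("EII", "VVV"),            # models to compare
  initialization = list(subset = init_rows),
  warn = FALSE                             # do not issue warnings
)

fit$modelName               # name of the selected model
fit$G                       # number of clusters selected
table(fit$classification)   # number of observations in each cluster
```

Restricting modelNames keeps the run short; leaving it as NULL lets Mclust compare every model that is applicable to the data.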

Some approaches that are used to produce the density estimation are as follows:
• Parzen windows: In this approach, a window is placed over the observations, and the density at each point is estimated from the observations that lie close to it
• Vector quantization: This approach lets you model the probability density function with a small set of representative vectors placed according to the distribution of the observations
• Histograms: With a histogram, you get a nice visual showing density (the size of the bars); the number of bins chosen while developing the histogram determines your density outcome

Density estimation is performed via the density function in R programming.
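To make the effect of the window width and the bin count concrete, here is a minimal sketch that uses density and hist on a made-up two-peaked sample. The sample, the seed, and the specific bandwidth and breaks values are assumptions chosen only for illustration.

```r
set.seed(7)   # arbitrary seed, only for reproducibility
x <- c(rnorm(150, mean = 0), rnorm(50, mean = 4))   # made-up sample with two peaks

# Kernel density estimate with the default Gaussian kernel and bandwidth
d <- density(x)
d$bw          # bandwidth that was selected automatically

# A smaller bandwidth follows the data more closely and gives a rougher curve
d_narrow <- density(x, bw = d$bw / 4)

# Histograms of the same data: the bin count changes the density picture
h10 <- hist(x, breaks = 10, plot = FALSE)
h40 <- hist(x, breaks = 40, plot = FALSE)
length(h10$counts)   # number of bins actually used
length(h40$counts)

# Draw the comparison only when running interactively
if (interactive()) {
  hist(x, breaks = 40, freq = FALSE, main = "Histogram and density estimates")
  lines(d)
  lines(d_narrow, lty = 2)
}
```

Note that breaks is treated by hist as a suggestion, so the number of bins used can differ from the number requested.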