Some common Machine Learning, Statistics and Data Science terms that start with F



### Word ### Description
|Factor Analysis|Factor analysis is a technique used to reduce a large number of variables to a smaller number of factors. It aims to find independent latent variables. Factor analysis makes several assumptions:
  • The relationships between variables are linear
  • There is no multicollinearity
  • Relevant variables are included in the analysis
  • There is a true correlation between variables and factors

Several methods are used to extract factors from a data set:

  1. Principal Component Analysis
  2. Common factor analysis
  3. Image factoring
  4. Maximum likelihood method|
    |False Negative|Points that are actually true but are incorrectly predicted as false. For example, suppose the problem is to predict loan status (Y: loan approved, N: loan not approved). The false negatives in this case are the samples for which the loan was approved but the model predicted the status as not approved.|
    |False Positive|Points that are actually false but are incorrectly predicted as true. For example, suppose the problem is to predict loan status (Y: loan approved, N: loan not approved). The false positives in this case are the samples for which the loan was not approved but the model predicted the status as approved.|
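
The two error types can be counted directly from actual and predicted labels. The following is a minimal Python sketch for the loan example; the function name `count_false_predictions` and the sample labels are made up for illustration.

```python
def count_false_predictions(actual, predicted, positive="Y"):
    """Return (false_positives, false_negatives) for two label sequences."""
    false_pos = 0
    false_neg = 0
    for a, p in zip(actual, predicted):
        if a != positive and p == positive:
            false_pos += 1  # not approved, but predicted as approved
        elif a == positive and p != positive:
            false_neg += 1  # approved, but predicted as not approved
    return false_pos, false_neg


# Hypothetical loan statuses: Y = approved, N = not approved.
actual = ["Y", "Y", "N", "N", "Y", "N"]
predicted = ["Y", "N", "N", "Y", "N", "N"]

fp, fn = count_false_predictions(actual, predicted)
print(f"false positives: {fp}, false negatives: {fn}")
```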
    |Feature Hashing|Feature hashing is a method for transforming features into a vector. Instead of looking up indices in an associative array, it applies a hash function to the features and uses their hash values directly as indices. A simple example of feature hashing:

Suppose we have three documents:

  • John likes to watch movies.
  • Mary likes movies too.
  • John also likes football.

Now we can convert these documents to vectors using hashing, with each term mapped to an index:

Term Index
John 1
likes 2
to 3
watch 4
movies 5
Mary 6
too 7
also 8
football 9

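The array form for the same, with one entry per term index (1 to 9), will be:

  • John likes to watch movies. → [1, 1, 1, 1, 1, 0, 0, 0, 0]
  • Mary likes movies too. → [0, 1, 0, 0, 1, 1, 1, 0, 0]
  • John also likes football. → [1, 1, 0, 0, 0, 0, 0, 1, 1]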

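As a rough sketch of the hashing step itself, the Python snippet below maps each token straight to a column by hashing it, so no term-to-index table has to be stored. The bucket count of 16, the choice of `zlib.crc32` as the hash function, and the function name `hash_vectorize` are assumptions for illustration. With few buckets, different terms can collide in the same column.

```python
import zlib


def hash_vectorize(text, n_buckets=16):
    """Turn a document into a fixed-length count vector via the hashing trick."""
    vector = [0] * n_buckets
    for token in text.lower().replace(".", "").split():
        # The hash value, reduced modulo the vector size, is used as the index.
        index = zlib.crc32(token.encode("utf-8")) % n_buckets
        vector[index] += 1
    return vector


documents = [
    "John likes to watch movies.",
    "Mary likes movies too.",
    "John also likes football.",
]

for doc in documents:
    print(doc, hash_vectorize(doc))
```

The vector length is fixed in advance and does not grow with the vocabulary, which is the main practical appeal of the method.
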
|Feature Reduction|Feature reduction is the process of reducing the number of features used in a computation-intensive task without losing much information.

PCA is one of the most popular feature reduction techniques, where we combine correlated variables to reduce the features.
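
To make the idea concrete, here is a small Python sketch that reduces two correlated variables to a single component. It is restricted to the two-variable case so that the principal direction can be found in closed form with the standard library; the function name `pca_2d_to_1d` and the sample data are hypothetical, and a real project would normally use a numerical library instead.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component.

    Returns the 1-D scores and the fraction of total variance they retain.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    var_x = sum(dx * dx for dx, _ in centered) / (n - 1)
    var_y = sum(dy * dy for _, dy in centered) / (n - 1)
    cov_xy = sum(dx * dy for dx, dy in centered) / (n - 1)

    # Angle of the eigenvector with the largest eigenvalue (2x2 symmetric case).
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    ux, uy = math.cos(theta), math.sin(theta)

    scores = [dx * ux + dy * uy for dx, dy in centered]
    var_pc = sum(s * s for s in scores) / (n - 1)
    total = var_x + var_y
    explained = var_pc / total if total > 0 else 0.0
    return scores, explained


# Hypothetical data: the second variable is roughly twice the first.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
scores, explained = pca_2d_to_1d(data)
print("explained variance ratio:", round(explained, 4))
```
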

|Feature Selection|Feature selection is the process of choosing the features that contribute most to the predictive power of a statistical model and dropping irrelevant ones.

This can be done either by filtering out less useful features or by combining features to create new ones.
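
The filtering approach can be sketched as follows in Python: each feature is scored by the absolute value of its Pearson correlation with the target, and only the top `k` are kept. The scoring rule, the function names and the sample columns are assumptions chosen for illustration; other filter criteria are equally valid.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def select_top_k(features, target, k):
    """Keep the names of the k features most correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical feature columns and target.
features = {
    "income": [2, 4, 6, 8, 10],
    "debt": [5, 3, 4, 1, 2],
    "constant": [1, 1, 1, 1, 1],
}
target = [1, 2, 3, 4, 5]
print(select_top_k(features, target, k=2))
```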

|Few-shot Learning|Few-shot learning refers to training machine learning algorithms on a very small set of training data instead of a very large one. It is especially useful in computer vision, where it is desirable for an object categorization model to work well without thousands of training examples.|
|Flume|Flume is a service designed for streaming logs into the Hadoop environment. It can collect and aggregate huge amounts of log data from a variety of sources. To collect high volumes of data, multiple Flume agents can be configured.

Here are the major features of Apache Flume:

  • Flume is flexible: it scales from environments with as few as five machines to several thousand machines
  • Apache Flume provides high throughput and low latency
  • Apache Flume has a declarative configuration and is easy to extend
  • Flume in Hadoop is fault tolerant, linearly scalable and stream oriented|
    |Frequentist Statistics|Frequentist statistics tests whether an event (hypothesis) occurs or not. It calculates the probability of an event in the long run of the experiment (i.e., the experiment is repeated under the same conditions to obtain the outcome).

Here, sampling distributions of fixed size are taken. The experiment is theoretically repeated an infinite number of times, but in practice it is run with a stopping intention. For example, I perform an experiment with the stopping intention that I will stop when it has been repeated 1000 times or when I have seen at least 300 heads in a coin toss.
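
The stopping rule from the coin-toss example can be simulated with a short Python sketch. The fair-coin probability of 0.5, the fixed random seed and the function name `toss_until_stop` are assumptions added for illustration.

```python
import random


def toss_until_stop(max_tosses=1000, min_heads=300, p_heads=0.5, seed=0):
    """Toss a coin until max_tosses is reached or min_heads heads are seen."""
    rng = random.Random(seed)  # seeded so that repeated runs are reproducible
    tosses = 0
    heads = 0
    while tosses < max_tosses and heads < min_heads:
        tosses += 1
        if rng.random() < p_heads:
            heads += 1
    return tosses, heads


tosses, heads = toss_until_stop()
print(f"stopped after {tosses} tosses with {heads} heads")
print(f"observed frequency of heads: {heads / tosses:.3f}")
```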
|F-Score|The F-score is an evaluation metric that combines precision and recall into a single measure of classification effectiveness. The relative importance of recall and precision is set by the β coefficient: β > 1 gives more weight to recall, β < 1 gives more weight to precision, and β = 1 weights them equally.

F measure = (1 + β²) × (Precision × Recall) / (β² × Precision + Recall)|
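
As a worked illustration, the Python sketch below computes the F-score using the standard F-beta definition, (1 + β²) × Precision × Recall / (β² × Precision + Recall), which reduces to the familiar F1 score when β = 1. The function name `f_beta` and the sample precision and recall values are hypothetical.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 favors recall, beta < 1 favors precision."""
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0  # convention for the undefined case
    return (1 + beta_sq) * precision * recall / denominator


# Hypothetical classifier with precision 0.5 and recall 1.0.
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: F={f_beta(0.5, 1.0, beta):.4f}")
```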