Tuesday, 28 July 2015

DATA AND BASIC TERMINOLOGIES

Data is all around us, but what exactly is it? Data is a value assigned to a thing. Take, for example, the balls in the picture below.

What can we say about these? They are golf balls, right? So the first data point is that they are used for golf. Golf is a category of sport, so this helps us place the ball in a taxonomy (a classification of things).

DIFFERENT TYPES OF DATA


MAINLY TWO TYPES


QUALITATIVE DATA: 


It covers everything that describes a quality of something. It is expressed in words rather than numbers, so it cannot be measured.

E.g. colour, texture, the feel of an object, the colour of the sky, a description of experiences, etc.




QUANTITATIVE DATA:



It refers to a number. It can be measured and compared.

E.g. the number of golf balls, their size, their price, the score on a test, etc.
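A rough sketch of how this split tends to show up in code, assuming Python: qualitative values are stored as text and quantitative values as numbers. The field names and the helper `is_quantitative` are made up for illustration.

```python
# One golf ball described by a mix of qualitative and quantitative values.
# All field names and the "dimpled" texture are illustrative assumptions.
ball = {
    "colour": "white",      # qualitative: a description
    "texture": "dimpled",   # qualitative: a description
    "diameter_mm": 43,      # quantitative: can be measured
    "price_rupees": 50,     # quantitative: can be compared
}


def is_quantitative(value):
    """Treat ints and floats (but not booleans) as quantitative."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Split the description into its two kinds of data.
quantitative = {k: v for k, v in ball.items() if is_quantitative(v)}
qualitative = {k: v for k, v in ball.items() if not is_quantitative(v)}
```

Only the quantitative part can sensibly be summed, averaged or sorted by size; the qualitative part can only be matched or grouped.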


OTHER TYPES


CATEGORICAL DATA:



It puts the item you are describing into a category. In the case of the golf balls, the condition would be categorical, with categories such as "new", "used" and "broken".
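One possible way to model such a fixed set of categories, assuming Python's standard `enum` module. The class name `Condition` and the helper `parse_condition` are hypothetical names chosen for this sketch.

```python
from enum import Enum


class Condition(Enum):
    """The allowed categories for the condition of a golf ball."""
    NEW = "new"
    USED = "used"
    BROKEN = "broken"


def parse_condition(text):
    """Map free text to a known category; raises ValueError if unknown."""
    return Condition(text.strip().lower())


condition = parse_condition(" Used ")
```

The point of the enum is that a value outside the agreed categories is rejected instead of silently creating a new one.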


DISCRETE DATA:


It is numerical data that has gaps in it (mainly whole numbers). It is based on counts. Only certain separate values are possible, and the values cannot be subdivided meaningfully.
E.g. the number of golf balls (there is no such thing as 0.3 golf balls), shoe size, the number of students in a class, etc.


CONTINUOUS DATA:


It is numerical data with a continuous range. It can take any numerical value within that range and can be meaningfully subdivided into finer and finer increments.
E.g. the diameter of a golf ball (10.356 mm), the height of a person, the size of your foot (as opposed to shoe size, which is discrete), etc.
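As a small illustration, assuming Python, a count is naturally an `int` and a measurement a `float`. The variable names and the helper `is_whole_count` are invented for this sketch.

```python
ball_count = 5         # discrete: a count, always a whole number
diameter_mm = 10.356   # continuous: a measurement, can always be refined


def is_whole_count(value):
    """True if value is a non-negative whole number (a valid count)."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


# A measurement can be reported at a coarser or finer precision...
rounded = round(diameter_mm, 1)

# ...but a count such as 0.3 golf balls is not meaningful.
fractional_count_ok = is_whole_count(0.3)
```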


UNSTRUCTURED DATA (DATA FOR HUMANS):


"We have 5 white used golf balls with a diameter of 43 mm at 50 rupees each": a plain sentence like this is easy for a human to understand but difficult for a computer to process. It is not machine readable.
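To get a feel for why such a sentence is hard for a computer, here is a minimal sketch, assuming Python's `re` module. The pattern is hand-written for this one sentence and the group names are made up; it is meant to show the fragility, not to be a recommended technique.

```python
import re

sentence = "We have 5 white used golf balls with a diameter of 43 mm at 50 rupees each"

# A pattern that only matches this exact phrasing.
pattern = re.compile(
    r"We have (?P<quantity>\d+) (?P<colour>\w+) (?P<condition>\w+) golf balls "
    r"with a diameter of (?P<diameter_mm>\d+(?:\.\d+)?) mm "
    r"at (?P<price>\d+) rupees each"
)

match = pattern.search(sentence)
fields = match.groupdict() if match else None

# The same facts in different words are no longer recognised.
reworded = "Five used white golf balls, 43 mm across, 50 rupees apiece"
reworded_match = pattern.search(reworded)  # None: the pattern does not fit
```

A human reads both sentences without effort; the program only copes with the one it was written for.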


STRUCTURED DATA (DATA FOR COMPUTERS):


Computers are inherently different from humans. If we want a computer to process and analyze data, it has to be able to read that data, so the data needs to be in a structured, machine-readable form. One of the most common formats is CSV (comma-separated values).
E.g. "quantity","colour","condition","item","category","diameter(mm)","price per unit"
     5,"white","used","ball","golf",43,50
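A short sketch of how a program might read this CSV, assuming Python's standard `csv` module. The header and row are the ones from the example, written without spaces after the commas; the variable names are illustrative.

```python
import csv
import io

# The header and the single data row from the example, as CSV text.
csv_text = (
    '"quantity","colour","condition","item","category","diameter(mm)","price per unit"\n'
    '5,"white","used","ball","golf",43,50\n'
)

# DictReader uses the first line as field names for every following row.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

first = rows[0]
# CSV values arrive as text, so numbers have to be converted explicitly.
total_price = int(first["quantity"]) * int(first["price per unit"])
```

Because every value sits in a named column, the program can pick out the quantity and the price directly, with no guessing about word order.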



FROM DATA TO INFORMATION TO KNOWLEDGE:

Colour: White
Category: Sport – Golf
Condition: Used
Diameter: 43 mm
Price (per ball): 50 rupees
But each of these data values is rather meaningless by itself. To create information out of data, we need to interpret it.
Take the size: a diameter of 43 mm doesn't tell us much on its own. It only becomes meaningful when it is compared to other things. In sports there are often size regulations for equipment. The minimum diameter of a competition golf ball is 42.67 mm (1.680 inches). Good, so we can use this golf ball in a competition. This is information. But it is still not knowledge. Knowledge is created when information is learned, applied and understood.
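The step from data to information can be sketched as a comparison against a reference value. This is a minimal Python illustration; the function name, the field names and the constant are assumptions for the sketch, with the minimum taken as 1.680 inches (about 42.67 mm), the figure given in the Rules of Golf.

```python
# Assumed reference value: 1.680 inches, roughly 42.67 mm.
MIN_COMPETITION_DIAMETER_MM = 42.67


def is_competition_size(diameter_mm, minimum_mm=MIN_COMPETITION_DIAMETER_MM):
    """Interpret a raw diameter by comparing it with the regulation minimum."""
    return diameter_mm >= minimum_mm


# The raw data values from the table (field names are illustrative).
ball = {"colour": "white", "condition": "used", "diameter_mm": 43, "price_rupees": 50}

# Data (43) plus context (the minimum) gives information (usable or not).
usable_in_competition = is_competition_size(ball["diameter_mm"])
```

The number 43 is the data; the statement "this ball is large enough for competition" is the information derived from it.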
