Axiomatization of Language – Proposal

So, I came to a very important realization today about what makes a word overly ambiguous (meaning that it has an infinite number of potential definitions): its definition is too broad. This reminded me of Russell’s paradox in set theory, which was the direct result of a lack of axioms that let sets be of unbounded size. So today, I will write about Russell’s paradox, its solution, and how this solution can be applied to a language.

Russell’s Paradox in Set Theory:

Before the times of the Zermelo-Fraenkel axioms of set theory, which changed the face of mathematics, sets, which are the simplest collections of objects, were defined through properties. For example, here’s a property – “is a cat,” (well I like cats), so now we can make a set of all cats. Or, we can take the property “likes the color black,” and make a set of all objects that like the color black. Sounds simple, right?? Awesome, here comes the fun part:

Let’s define a property that a set can have like so: Let’s say that a set X has the property D if X is not an object in X (itself). 

For example, the set {1,2,3,4} has the property D, because {1,2,3,4} is not an object in {1,2,3,4}. We can think about {1,2,3,4} as a box which contains 4 different numbers. Now, we want to know if this box (set) has the D property, which means that we want to know if the box contains itself. It doesn’t, and therefore, it has the D property. In general, nearly all sets that we can think of have the property D, so we can imagine already that the set of all sets with the D property is ridiculously large.

[Image: Treasure-Chest]

In a more visual manner, this treasure chest contains treasure, but it doesn’t contain a treasure chest identical to itself in every way, from size to color, to material, and therefore, this treasure chest has the D property.

We defined a property, D, correct? So, now what are we going to do?? Simple – we’ll define a set R={X: D(X)=1}={X such that X has the D property}={X, such that X is not an object in X}

Now arises the question, does R have the D property??? There are only two possible answers to this question – yes or no. Let’s see what happens in every case:

  1. If R has the D property – Then R must be an object in R, because every object with the D property must be in R. On the other hand, since R has the D property, according to the definition of the D property, R isn’t an object in R. Since we’ve arrived at a contradiction, this option is definitely incorrect, so the remaining option must be correct. Let’s evaluate it:
  2. If R doesn’t have the D property – Then R is not an object in R, because only objects with the D property are in R. But, since R is not an object in R, then R must have the D property, because this is the definition of the D property. We’ve once again arrived at a contradiction, and therefore, this option is also definitely incorrect.
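The two cases above can be compressed into a single line. Writing the D property in symbols as X \notin X (this is just the definition of D, nothing new), the definition of R forces the following, which is a sketch of the contradiction rather than a formal proof:

```latex
\[
R = \{\, X : X \notin X \,\}
\quad\Longrightarrow\quad
\bigl( R \in R \iff R \notin R \bigr)
\]
```

A statement can never be equivalent to its own negation, so no such R can exist as a set.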

Since all of the possible options lead to a very clear contradiction, there must be a problem somewhere along the way. There were several suggested solutions, and of those, one stood out and was very widely accepted in the mathematical community – the Zermelo-Fraenkel axioms.

The Solution to Russell’s Paradox – The Zermelo-Fraenkel Axioms

The primary idea behind the Zermelo-Fraenkel Axioms is that the set R from Russell’s contradiction is too large, and that in general, sets that are too large will cause contradictions. Therefore, the goal of the Zermelo-Fraenkel Axioms is to write a logical list of rules (axioms), which will enable us to create sets that are not ridiculously huge.

Here’s the list of the necessary and trivial Zermelo-Fraenkel Axioms:

      1.  Axiom of Extensionality –
        Two sets are equal if and only if they share the exact same objects.
        \forall A,B, A=B if and only if, \forall x: x \in A if and only if x \in B
      2.  Axiom of Regularity (Foundation) –
        Every non-empty set A contains an object B, which is disjoint from A, meaning that A and B have no common objects.
        \forall A, A!=\O implies \exists B \in A: \forall x, ~(x \in A and x \in B)
      3.  Axiom of Restricted Comprehension (how to build the empty set) –
        A property can only carve a subset out of a set A that already exists, so \exists \O the empty set, which can be built using a trivial contradiction, like:
        \O={X \in A: X!=X}={X in A: X isn’t equal to X}
      4.  Axiom of Pairing –
        For any two sets in the world, A,B, there’s a set C, which contains A,B as objects.
        \forall A,B, \exists C: (A \in C) and (B \in C)
      5.  Axiom of Union –
        The union of two sets is a set, when union is defined using the logical OR.
        \forall A,B, \exists C: C={x: x\in A OR x\in B}
      6.  Axiom of Intersection –
        The intersection of two sets is a set, when intersection is defined using the logical AND.
        \forall A,B, \exists C: C={x: x\in A AND x\in B}
      7.  Axiom of Power set –
        For any set, A, the collection of A’s subsets, which is known as the power set, is a set.
        \forall A, \exists C: C=P(A)={B: B\subseteq A}
      8.  Axiom of Minimal Set –
        For any set A, the set containing the singleton A is a set.
        \forall A, \exists B: B={A}

These axioms enabled mathematicians to define sets in a clear manner, without creating bizarre contradictions, such as Russell’s contradiction, by using the very basic idea that huge sets cause problems.

How exactly can we view huge sets??? We can view them as sets that contain too many objects, and now, let’s use this analogy in the world of spoken language:

Why We Should Apply the Zermelo-Fraenkel Fix to Spoken Language:

We can use the huge set idea from Russell’s contradiction in spoken languages, by saying that a word is like a set, and its objects are the subjects/words which are a part of the category which this word creates. For example, we can take a word like “Religion,” which creates a category, which contains other words, such as “Christianity,” “Paganism,” “Islam,” “Shinto,” etc. Another example is the word “comfortable,” which creates a category, containing words and phrases, like “satisfactory,” “within 1 standard deviation from the optimal softness,” etc.

In a similar manner to Russell’s contradiction, a word which creates a category with too many words and phrases causes many problems and contradictions. How does this work?? Let’s see this in an example, one of the words that I hate very much, which is the word “natural.” The word “natural” creates a category, which is too broad, because it contains words from “nature,” to “habit,” to “God’s will,” to “social norms,” to “what an individual feels comfortable with.”

Did you catch those contradictions there??? Nearly every word here contradicts another. Let’s examine this:

  • “God’s will” vs. “Habit” – I will begin laughing like crazy here, because God’s will differs among Gods and religions, and habit depends on so many factors, from religion, to education, to intelligence type, to… I can go on forever here. Also, if we were to assume that there is a God (let’s assume the Jewish god, because I’m familiar with this from school, unfortunately), then, he’d want me to stay away from electricity on Saturdays…. Like I’d ever do that. I need my internet on Saturdays, and that’s my habit, so obviously, “God’s will” contradicts my habits, and I’m obviously not the only atheist in the world who’s bothered by “God’s will.”
  • “God’s will” vs. “What an individual feels comfortable with” – Let’s see, I feel comfortable baking my amazing chocolate chip vegan cookies with wheat flour during Passover, and “God’s will” is for me to stay away from wheat. So again, a clear contradiction, which is definitely not unique to me only.
  • “Social Norms” vs. “What an individual feels comfortable with” – This one is way too ridiculous, because there exists something called “not mainstream,” which is already enough evidence to support the claim that social norms contradict what some individuals feel comfortable with. For example, one of the most important social norms in Israel is mandatory army service for all people from age 18 until age 21-24. Yet, there are lots of individuals who do not feel comfortable with the army service during these years, due to an understanding that the mandatory draft is characteristic of a fascist regime, pacifism, hating to wear the same thing every day, and a bunch more reasons. These individuals feel comfortable not joining the army, which is clearly a contradiction to the social norm.
  • “Nature” vs. “What an individual feels comfortable with” – Most individuals feel comfortable with using a cell phone, a laptop, a television, and a bunch more technologies, which obviously do not exist in the African safaris or the tropical rain forests, which represent nature. So again, another contradiction.
  • “Social Norms” vs. “Nature” – One of the most obvious social norms in the modern world is owning a car, a weapon which causes the death of many animals (in nature), emits greenhouse gases, which harm the atmosphere, which is crucial to nature, and harms natural resources in many other ways, so again, social norms contradict nature, not that I have any problem with owning a car.
  • “God’s will” vs. “Nature” – Let’s be precise here, and say that God is the creation of mankind, and not the other way around. According to God’s (some person’s) word in the bible, the earth is some 6,000 years old. On the other hand, the earth, meaning soil, which is literally nature, has shown scientists through fossils, radiometric dating, and other tools, that the earth is about 4.5 billion years old. Tell me that’s not a contradiction (hahaha).

So, in short, we can see that words which create a category that’s too broad create contradictions, and therefore, we need to find some axiomatization of words, in order to get rid of broad categories. We are facing the exact same problem that set theory was facing with Russell’s contradiction, and therefore, we can try to use the same ideas behind the Zermelo-Fraenkel axioms in order to fix our current problem. The only question at the moment is how. I’ll be honest and say that I don’t have an idea at the moment, and therefore, I’ll think about it, and give an attempt (which may not succeed) in a future post. So until next time.


Correct Generalization – A Tribute to Victims of Discrimination

So, yesterday was Holocaust Day in Israel. In general, I really despise all of these types of grieving oriented holidays, since I don’t like having emotions forced on me, but this particular grief day is terrible in my opinion, since it lacks the very simple mathematical concept of Correct Generalization. 

What is Correct Generalization??

Mathematicians and humans in general HATE memorizing a lot of separate cases, and most people are really bad at memorizing unrelated facts. So, how do we solve this problem in all of the fields of science? We try to find a general rule in order to solve a very specific case. This general rule will enable us to solve all other cases with similar conditions. 

Let’s note that since correct generalization is CORRECT, the specific case must have useful properties, which imply other properties, such that we can find a common denominator between all of the cases, which enables the generalization. Basically, what I’m saying is that if a generalization isn’t correct, then it’s as bad as a useless definition. 

Now, enough talk – Time for some Scientific Examples:

In Mathematics:

Contrary to popular belief, mathematicians never use numbers (not even in number theory). Why? Simple – when we use numbers to solve a particular equation, such as 4x^2-1=0, then we will only find a finite number of solutions (in this case, only 2). On the other hand, if we were to solve the equation (ax)^2-b^2=0, then we’d say that (ax)^2-b^2=(ax-b)(ax+b)=0, meaning, that the two solutions for the equation are x=b/a and x=-b/a. With this solution, we can figure out the solution to 6x^2-7=0 (take a=\sqrt(6) and b=\sqrt(7)) and 10234x^2-39847589=0, and the general solution for all a and b (as long as a isn’t 0).

Once we know how to find a general solution to a certain equation, we can easily write computer code that uses the general solution to solve particular cases, which saves us a lot of time and calculation effort. Therefore, we’d obviously prefer to solve the general problem, spend about 2-15 minutes (depending on how skilled we are at programming) to write the code, and then, spend under a second calculating 100 different solutions, for 100 different a’s and b’s.
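As a rough sketch of what such code could look like, here is the general solution x=b/a, x=-b/a written once and reused for any a and b. The function name solve_difference_of_squares is made up for this illustration, and the sketch assumes that a is not 0:

```cpp
#include <stdexcept>
#include <utility>

// Returns the two solutions x = b/a and x = -b/a of (a*x)^2 - b^2 = 0.
// Hypothetical helper for illustration; assumes a != 0, because
// otherwise x does not appear in the equation at all.
std::pair<double, double> solve_difference_of_squares(double a, double b)
{
    if (a == 0.0)
    {
        throw std::invalid_argument("a must be non-zero");
    }
    return std::make_pair(b / a, -b / a);
}
```

For 4x^2-1=0 we have a=2 and b=1, so the function returns 0.5 and -0.5. For 6x^2-7=0 we would pass a=sqrt(6) and b=sqrt(7).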

Now, can you see why mathematicians have the most fun? Because they don’t waste their time on calculation (which we all hate), and instead, try to find a general case and throw the burden of calculation on a computer.

In Computer Sciences:

Since I’m a mathematician, and therefore, like nearly all mathematicians, am very lazy, I will keep on bringing that computer code that moves between number bases (also because I really like representations). So, here’s the code:

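#include <iostream>
#include <vector>
using namespace std;

// Prints the base-b digits of n (n >= 0, b >= 2), most significant digit first.
void int_base_find(int b, int n)
{
    vector<int> arr;
    // Peel off the digits from least to most significant using integer
    // arithmetic only, which avoids rounding errors from log() and pow().
    do
    {
        arr.push_back(n % b);
        n = n / b;
    } while (n > 0);
    // The digits were collected in reverse order, so print them backwards.
    for (int i = int(arr.size()) - 1; i >= 0; i--)
    {
        cout << arr[i] << " ";
    }
}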

Let’s notice that this code takes an integer n (in its decimal representation) and represents it according to the base b. This basically means that it takes ANY non-negative integer n and represents it according to ANY base b, which saves us a lot of time and memory, because if we want to know the binary representation of 234123, the ternary representation of 8940845, the hexadecimal representation of 234452435, and a bunch of representations of random numbers according to many different bases, we’d be able to use this code, and we won’t have to write a unique code for every particular representation.
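To try the “one code for every base” idea without printing, here is a minimal sketch of the same conversion that returns the digits instead. The name to_base and the choice to return a vector are my own assumptions for illustration; it uses repeated division by b and assumes that b is at least 2:

```cpp
#include <stdexcept>
#include <vector>

// Returns the digits of n in base b, most significant digit first.
// Hypothetical helper for illustration; requires b >= 2.
std::vector<int> to_base(unsigned long long n, int b)
{
    if (b < 2)
    {
        throw std::invalid_argument("base must be at least 2");
    }
    std::vector<int> digits;
    do
    {
        // n % b is the current least significant digit,
        // so it goes in front of the digits found so far.
        digits.insert(digits.begin(), int(n % b));
        n /= b;
    } while (n > 0);
    return digits;
}
```

For instance, to_base(16, 2) gives the digits 1 0 0 0 0, and to_base(16, 3) gives 1 2 1.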

By the way, I bring up base representation quite a lot, because it’s very useful in minimizing the amount of memory a program takes up, which eventually enables our iPhone apps to answer nearly all of our questions in under a few seconds.

In Physics:

This is a slightly more interesting example, because at nearly every point in time, physicists have 3 options:

  1. Find a correct generalization (which you literally have to be Einstein in order to do so)
  2. Say that there’s no way that the two observed phenomena are related (which nearly no good physicist will say)
  3. Assume that there’s a correct generalization, since there probably is one, but accept that they cannot figure it out at the moment, and then, spend years trying to figure it out. Along the way, this physicist may think about many incorrect generalizations, but then realize that they don’t reflect their observations, and then, choose one of these three options again.

For example, after the apple fell from the tree, Newton made a correct generalization by figuring out the basic laws of Newtonian mechanics. Newton observed everyday phenomena, such as free fall (of the apple of course), projectile motion (which is a fancy word for horizontal throw with gravity), and slipping. His mathematical laws reflected the physical world that most people know very well, so scientists kept on working with them for a very long time.

Years later, scientists began to take interest in particles and other very small physical objects, which would travel at very high velocities, which were very close to the speed of light. These scientists observed that Newton’s laws were not valid in these cases of small objects (as in not visible to the naked eye) and high velocities, and therefore, they tried hard to figure out what laws determine these objects’ motion.

After many years of trying to explain why classical mechanics wasn’t valid in these situations, Albert Einstein gave a generalization which explained both the motion of large objects traveling at low velocities and the motion of small objects traveling at high velocities. His generalization is known by the very famous name of the Theory of Relativity (Special Relativity for high velocities, which he later extended to gravity in General Relativity), which is very important and useful. Also, your GPS can accurately state your location due to Einstein’s amazing discovery.

Now Let’s Look into Incorrect Generalization

As you may have guessed, an incorrect generalization is a generalization of a specific case to an entire set, when that specific case’s property isn’t even related to the entire set.

Now let’s formalize this. Let’s assume that we have a set S, and a property d.

Specific Case: exists x in S:  d(x)=1

General Case: for all x in S, d(x)=1

Here’s a very legitimate question – How did we move from the specific case to the general case?

Simple – we figured out that all of the objects in our set S have another property D, which enables them to possess the property d. 

For example, if S is a set of all of the dogs in the world, and d is the property “has a cardiovascular system,” then we can say that all x in S (dogs) have the property d, since they possess the property D=“is a mammal,” and all mammals possess the d property.

What exactly did we do? We said, S is a subset of {x: D(x)==1} (meaning mammals), which have the property d. So, in order to generalize, we looked for a superset, and used the existence of that property in the superset, which implied the existence of that property in the subset.
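Written as a chain of inclusions, the superset argument looks like this. It is only a restatement of the paragraph above in symbols, using the same S, D and d:

```latex
\[
S \subseteq \{\, x : D(x) = 1 \,\} \subseteq \{\, x : d(x) = 1 \,\}
\quad\Longrightarrow\quad
\forall x \in S,\; d(x) = 1
\]
```

In the dog example, the first inclusion is “every dog is a mammal” and the second is “every mammal has a cardiovascular system.”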

Now, Let’s Examine an Incorrect Generalization: Discrimination

What’s discrimination?? Before we can define discrimination, we have to define some sets:

  • Neuro – the set of all objects in the world with a neurological system
  • P(Neuro) – the set of all of the subsets of objects with a neurological system. For instance, Animals, People, Men, Lawyers, etc. are all objects in P(Neuro)
  • M(t)={r \in Rights: |{x \in Neuro: r \in Rights(x,t)}|/|Neuro|>=0.8} – meaning the set of all of the rights that at least 80% of the objects in Neuro possess at a certain time t

We’ll say that a set A in P(Neuro) discriminates against a set B in P(Neuro) if:

  1. All of the objects in B have a certain property that they cannot change in a continuous fashion and they didn’t choose. For instance, if B=African-Americans, and we take d=“is African American,” then d is a property that cannot be changed nor chosen. In a formal way: \exists d: \forall b \in B, d(b)==1 \wedge ~(\exists f \in C(Time): f(d(b))==0)
  2. There exists a right r in M, such that A takes r away from every b in B which still possesses that right: \exists r \in M, \exists f:A–>(Time,Algorithm), \exists g=f(A,B): \forall b \in B, \forall t \in Time: t>min(time,g), r \notin Rights(b,t)

Why is this a bad generalization? Because the property d, which all objects in B possess isn’t a useful property, since it cannot be changed nor chosen, and therefore, this property d cannot imply other non-trivial (meaning that are not d) useful properties. 

For instance, humans discriminate against animals, by locking them up in cages and eating them, only because they possess the property “is an animal, which isn’t a human.”

Despite the fact that discrimination is exactly what a bad and incorrect generalization is, we can still get a correct generalization in this terrible crime against Neuro – This is the definition of discrimination, which refers to how one general set takes away the rights of another general set. This definition doesn’t say that discrimination is the Holocaust, or slavery, or not letting gays marry, or not letting women vote, rather, it explains that these are all specific cases of the vicious crime of discrimination.

This is a very important generalization, because now, we can say discrimination is bad. period., instead of saying what most people say, which is, “It’s okay that I believe that all Arabs should die, despite believing that the Nazis did something absolutely wrong,” or “It’s okay for me to support businesses that test their products on animals, though I’m pro gay marriage,” which are absolutely hypocritical statements, because these statements support one form of discrimination, but oppose another.

So the good generalization here is that discrimination of any form, between any two subsets of Neuro is a crime, and those who commit this crime should be severely punished. If all of the “subset of Neuro rights” organizations (such as women’s rights, children’s rights, animal rights, etc.) would understand this generalization, then we could definitely abolish this disgrace from our world.

Wow… that was a pretty political post, which shows us that mathematics can be applied absolutely anywhere, and through mathematical concepts, such as correct generalization, we can find ways to work faster, work more efficiently, minimize suffering, and in general (hahahaha… saying general), make the world a much better place.

Axiomatization of Legal Documents

Here, I give an axiomatization of all legal documents, using mathematical language. This is a guide about how to write a legal document with as few loopholes and lacunas as possible.

A Slightly More User Friendly Post – Some (Real) Math Problems and Ideas That Get Your Mind Running

Obviously, some of the tough and deep math may not be fit for every person, so I decided to write a more user friendly post, in which I’ll discuss some math problems that you can bring up at a non-math related dinner or something (I’m speaking from experience – people enjoy hearing these problems and ideas). These problems and ideas will hopefully change people’s antagonistic nature towards math.

Beginner: A Cute Problem in Combinatorics/Algebra:

I know that 2^64 is a 20 digit number. Don’t ask me how I know – I just know. Now, I’ll ask you a question: Does it have a digit that repeats itself 3 times? (Hint: A fifth grader can solve this problem).

Mind Blowing Idea: There are many types of infinity, some are larger than others:

We’ve all been told that infinity is a number which is larger than every number in the world. Not true. There are many different types of infinity, as a matter of fact an infinite number of infinities (haha.. but which one). That’s for a different post – but anyways, we have 2 significant and interesting types of infinities, which are א and א0, when א0<א. What do these Hebrew letters represent?

א0 – the number of natural numbers, meaning, positive whole numbers 

א – the number of real numbers, meaning, natural numbers, rational fractions, and irrational numbers.

Now here’s the challenge – prove or at least think about why א0<א.

Beginner: The Monty Hall Problem (a basic probability and logic question):

I’ll copy the description from wikipedia: “Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?”

The solution here is quite tricky, so try to be objective and logical. Think probability!

Beginner: For those of you who love to drive (like me):

If you love to drive, then you may like to drive fast, or slow, but what does that even mean? What exactly is speed? For those of you who love to drive stick-shift (like me), you’ll probably know what I mean when I say that I enjoy the rush of accelerating, pressing on the clutch, switching the gear, and then slowly lifting the clutch. Now what exactly is acceleration and deceleration? How can we represent speed and acceleration using only time and distance?

Also, if you love to drive or love the environment (again like me), then, you probably check your fuel efficiency. What exactly is fuel efficiency anyways? And why is driving very fast and braking suddenly not fuel efficient?

Mind Blowing idea: 16=10000=121?

So, I lied a little bit. 16 is the representation of the number sixteen according to base 10, meaning, sixteen=1*10^1+6*10^0=10+6. 10000 is sixteen’s binary representation, meaning sixteen=1*2^4+0*2^3+0*2^2+0*2^1+0*2^0=2^4. 121 is sixteen’s ternary representation, meaning, sixteen=1*3^2+2*3^1+1*3^0.

Now, here’s a little something to think about: How would all of our arithmetic rules change if we were to represent a number according to a different base?
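If you want to check these three sums without doing the arithmetic by hand, here is a small sketch that evaluates a list of digits in a given base. The name from_digits is hypothetical, and the loop uses Horner’s rule (multiply by the base, add the next digit), which is a standard way to evaluate such sums rather than anything specific to this post:

```cpp
#include <vector>

// Evaluates a digit list (most significant digit first) in the given base.
// Horner's rule: value = (...((d0 * base + d1) * base + d2)...) * base + dk.
// Hypothetical helper for illustration; assumes every digit is in [0, base).
long long from_digits(const std::vector<int>& digits, int base)
{
    long long value = 0;
    for (int d : digits)
    {
        value = value * base + d;
    }
    return value;
}
```

Calling from_digits({1, 6}, 10), from_digits({1, 0, 0, 0, 0}, 2) and from_digits({1, 2, 1}, 3) gives sixteen every time.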

For the sharper minds, here’s an even more interesting question: how do you move between different base representations (do not read the algorithm I gave in the translation post)?

Intermediate: Risky Business:

How do we define risk mathematically? Why are some activities or decisions considered to be riskier than others?

Mind Blowing Idea: Discretization of The Continuum Time:

Time is continuous, and this is why we can experience every single moment. This basically means that between any two given times, t1<t2, no matter how small the gap δ=|t2-t1| seconds between them is, there’s a new time, t, such that t1<t<t2 (for instance, t=(t1+t2)/2). This property is called density, and the real numbers have it.

Yet, why is it that every single time that we measure time, we use a discrete measure, such as hours, years, minutes, seconds, milliseconds? Like, we never say, “I’m 24.293847983275+pi years old,” or “I can be at the office in 15.23452 minutes,” or “I can be at the office in 0.23488503 hours,” “I updated my twitter status 2.23405849843 seconds ago.” Instead, we give natural number times, such as 27 years old, 15 minutes, 2 seconds, etc. Even scientists use milliseconds, microseconds, or nanoseconds to measure, instead of using extremely long decimal representations.

So, how should we view time, as discrete or continuous? Let’s note that discrete math, such as graph theory, set theory, and algebra, is very different from continuous math, such as calculus, analysis, geometry, and topology.

Intermediate: Prove that there’s an infinite number of primes:

This sounds quite threatening (so weird that I’m putting this in intermediate, right?), but Euclid was able to do this, and quite frankly any sixth grader can think up a proof. (Hint: Start with, let’s falsely assume that there’s a finite number of primes).

Mind Blowing Idea: Poincare’s Conjecture (proved by Grigori Perelman):

Poincare’s Conjecture (topology) states that every three dimensional space without holes can be blown into a sphere. For example, we can blow a pyramid up into a sphere if we get rid of the edges by blowing it up. Just try to visualize it. Insane!

This was an open question for nearly 100 years, and was even one of the millennium problems, which are a set of open questions in mathematics, and those who can prove them will receive a 1 million dollar prize. The most famous millennium problems are the Riemann Hypothesis (number theory) and P vs. NP (computability). Russian mathematician Grigori Perelman proved Poincare’s Conjecture in 2002-2003 and was awarded the prize in 2010, but he refused to accept the prize money and withdrew himself completely from the mathematical community.

Intermediate: Boys Know Girls Who Know Boys:

In my combinatorics class (this is a combinatorics question), there are 20 boys, each one of them knows exactly 4 girls, and every girl knows exactly 5 boys. How many girls are there in my combinatorics class?

Mind Blowing Idea: Representation:

This is one of my favorites, and actually one of the first things that blew my mind away when I was introduced to the math world.

So, remember the base representations that I wrote about earlier? Well, there are many different types of representations, such as:

  1. Representing a complex number as the sum of a real number and an imaginary number:  z=a+bi, when a,b are real
  2. Or more generally, representing a number as the sum of a rational number and a rational number times a root, such as,  z=p+q\sqrt(d), when p,q are rational, and d is whole (not necessarily natural, and when d=-1, then, \sqrt(d)=i).
  3. Representing a natural number as a multiplication of prime numbers
  4. Representing a vector as a sum of unit vectors. When we change the unit vectors, the representation also changes, according to a change of basis algorithm, meaning, that in 2-dimensional space, we can write: v=a(e1)+b(e2)=c(u1)+d(u2), when {e1,e2} is a basis for R2 and {u1,u2} is a basis for R2, meaning every vector has a representation like the one above. The smart ones can begin to think of a change of basis algorithm.
  5. Representing a continuous function as a sum of polynomials (Taylor sums) or exponents (Fourier sums).

Advanced: Cute Number Theory Question For the Clever Ones:

Actually, this question was on my number theory homework, but it can be solved without any knowledge of number theory:

Calculate the one’s digit of the number 27^(27^(27^27)).

The solution is quite tricky, but try to look for some rule and understand what exactly is the one’s digit of a number.

Mind Blowing Idea: Search:

If we have a sorted array consisting of 10,000 values, which we cannot see, and we want to know if one of those 10,000 values is 532, what’s the most efficient way to do so?

Now if that was easy, think about what’s the most efficient way to sort 10,000 random numbers. These are well-known basic algorithms in computer sciences.

Advanced: Ramsey’s Question:

Let’s assume that you can have one of two relationships with people in the world: friends or strangers, meaning, that for any given person in the world, you’re either their friend or you don’t know them, and then you two are strangers. We want to put n people in a room, such that we have 3 people, who we’ll call i, j, k, who are all friends (meaning i and j are friends, j and k are friends, and i and k are friends) or who are all strangers. By this, I mean that they were friends or strangers before they entered the room – what happens inside the room doesn’t interest us. What’s the minimal n that is required in order to ensure this?

The super sharp people can try to think about what’s the minimal n required to satisfy having m people (not 3, rather a general m) who are all friends or strangers. If you can solve this, you should publish a paper, because there currently isn’t a solution for a general m.

Link

So…. how do I begin? I opened a new facebook page for the cause, which is raising awareness about the lack of definitions in the world and trying to fix definitions and defining things in the world using mathematical symbols. So, I’d be very happy if people would come join the group, and here’s the link:

http://www.facebook.com/pages/Definition-in-Life/553370178027890?ref=tn_tnmn

Once more people will see the problem and understand it, then it’s possible that we could begin moving things, such as changing legal contracts, languages, etc. 

Thanks and Happy Pi day!!!!

Good Measurements

In most of my posts, when I describe a good definition, I usually give a numeric measurement for that definition, such as “someone is being loud if they’re talking over X decibels at Y frequency.” Now arises the question, what’s a good measurement? In order to answer that, we’ll discuss certain types of measurements, such as distance (metrics and norms), measures (part of measure theory, which I have not yet studied, and therefore will not discuss now), cardinality, length, and other measurements both in math and in the real world.

Before this, I’ll say that a good measurement is a measurement which can be interpreted in one way only, meaning that M (over a set X) is a good measurement if it’s a function, meaning:

M: X→R (or some other set of numbers)

\forall x \in X, \exists! y \in R: M(x)=y

when \exists! means there exists exactly one.

So, let’s start simple, by discussing cardinality:

Cardinality

Definition – In finite sets, cardinality is the number of objects in the set. For example, the cardinality of {0,1,2,3,4} is 5. The cardinality of a set A can be written |A|. In infinite sets (and in general, it’s a nice exercise to check this for finite sets), we say that |A|=|B| if there exists an invertible function f:A→B. For example, |N|=|Z| (N- natural numbers, and Z- integers).
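As a hedged illustration of |N|=|Z|, here is one possible invertible function (there are many): it sends the even naturals to the positive integers, and the odd naturals to zero and the negative integers. The names nat_to_int and int_to_nat are hypothetical, and the sketch assumes that N starts at 1:

```cpp
// f: N -> Z with N = {1, 2, 3, ...}:
//   1 -> 0, 2 -> 1, 3 -> -1, 4 -> 2, 5 -> -2, ...
long long nat_to_int(long long n)
{
    return (n % 2 == 0) ? n / 2 : -(n - 1) / 2;
}

// The inverse of f: positive integers go back to even naturals,
// zero and the negative integers go back to odd naturals.
long long int_to_nat(long long z)
{
    return (z > 0) ? 2 * z : 1 - 2 * z;
}
```

Since each function undoes the other, every natural number is matched with exactly one integer and vice versa, which is what the definition of equal cardinality asks for.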

When do we use Cardinalities in Real Life – Cardinalities are very useful in counting how many objects have a certain property, such as C=|{tests that Noy has during June 2013}|. Very simple, right? Cardinalities are simple, because sets are the most basic object in mathematics. 
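To make the claim |N|=|Z| a bit more concrete, here is a minimal Python sketch of one possible invertible function between the two sets. It assumes that N starts at 0, and the function names nat_to_int and int_to_nat are made up for this illustration. A computer can only check finitely many values, so this is a sanity check, not a proof.

```python
def nat_to_int(n: int) -> int:
    """Map 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ... (even n go up, odd n go down)."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2


def int_to_nat(z: int) -> int:
    """Inverse map: send each integer back to the natural number it came from."""
    return 2 * z if z >= 0 else -2 * z - 1


# Going there and back again returns the original number on a finite sample.
print(all(int_to_nat(nat_to_int(n)) == n for n in range(1000)))
print(all(nat_to_int(int_to_nat(z)) == z for z in range(-500, 500)))
```

Both checks passing on the sample is what we expect from a function that has an inverse.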

Let’s go on to something more difficult and less intuitive – metrics and norms:

Metrics and Norms

Metrics are a way to measure distance in a metric space, and norms are a way to measure size in a normed space. Let’s define these objects and discuss their applications.

Metric (Definition): Let X be a set, and d be a function d:X×X→R (real numbers). d measures distance if:

  • Non-Negativity:

This means that \forall x,y \in X, d(x,y)≥0 and d(x,y)=0 ↔ x=y

*The double-sided arrow means “if and only if”: A ↔ B means that A implies B and B implies A.

  • Symmetry:

This means that \forall x,y \in X,  d(x,y)=d(y,x)

  • Triangle Inequality:

This means that \forall x,y,z \in X,  d(x,z)≤d(x,y)+d(y,z)

Now how does this definition fit with our everyday idea of distance? Simple. Non-negativity and symmetry are pretty obvious. The triangle inequality says that taking a detour through a third point y can never be shorter than going directly from x to z. In a triangle, each side is at most as long as the other two sides put together, and that’s where the name comes from.
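Here is a small Python sketch that tests the three conditions above by brute force on a finite sample of points. The function name looks_like_metric and the sample values are assumptions made for this illustration. Passing the test on a sample doesn’t prove that d is a metric on all of X, but failing it does prove that d is not one.

```python
from itertools import product


def looks_like_metric(points, d, tol=1e-12):
    """Check the metric axioms on every pair and triple taken from a finite sample."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                      # non-negativity
            return False
        if (d(x, y) == 0) != (x == y):       # d(x,y)=0 if and only if x=y
            return False
        if abs(d(x, y) - d(y, x)) > tol:     # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # triangle inequality
            return False
    return True


sample = [-2.0, 0.0, 0.5, 3.0]
print(looks_like_metric(sample, lambda x, y: abs(x - y)))    # True
print(looks_like_metric(sample, lambda x, y: (x - y) ** 2))  # False
```

The squared difference fails because of the triangle inequality: going from -2 to 3 directly “costs” 25, while going through 0 costs only 4+9=13.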

Let’s think of some examples of metrics both in math and in real life:

1. Lp metrics – d(x,y)=(∑|xi-yi|^p)^(1/p), for p≥1. The Lp metric is different for every p that we choose, and different values of p suit different spaces. For instance, the L1 metric (also known as the taxi-cab metric, because it’s the distance for a cab driver who is limited to driving along the streets) is good for a space made of a grid of lines, whereas the L2 metric (also known as the Euclidean metric) is the usual straight-line distance, both in math and in real life. For example, the straight-line distance between Paris and New York on a flat map is measured using the L2 metric (a real plane follows the curve of the Earth, so this is only an approximation).

L1 (taxi cab metric)

2. The Discrete Metric – d(x,y)={0; if x=y,  1; if x≠y} The discrete metric is very useful in recognition and grouping problems. For example, suppose we have a beginner’s debate club (in which people don’t yet know how to argue for positions that they don’t agree with), and we want the members of the club to divide into groups which will argue for or against something – let’s say the death penalty. In order to group these people, we’ll ask every member of the club to answer a poll, which asks, “Are you for or against the death penalty?”, and the possible answers are “for,” “against,” and “unsure.” Then, we’ll define our set X to be all of the answers that people gave (this set will usually be finite), and we’ll divide people into debate groups according to their answers. In other words, we’ll write the equivalence class of a person x like this: [x]={y \in Club: answer(x)=answer(y)} or, using the metric, [x]={y \in Club: d(answer(x),answer(y))=0}. Conclusion: we can express equivalence relations using metrics.

3. The Graph Metric – This metric is defined as the length of the shortest path on a graph between x and y. The graph metric is very useful in many real-life problems, such as social/business networking. For instance, suppose we have an idea for a start-up company and we need sponsors, but we don’t have very much time, so we cannot send emails to all of the potential sponsors in the world. Let’s assume that the richest and most powerful sponsors are too busy to read our emails, but we still want their support. To get it, we’ll contact less powerful sponsors, who can refer us to some of their higher-ranking business partners, who can continue referring us to more powerful people, until we get to the most powerful sponsors. Now the question arises: what’s the shortest possible path to the richest sponsors? In other words, if we define X to be the set of sponsors who have time to listen to our ideas and P to be the set of the most powerful sponsors, what’s min{d(x,p): x \in X, p \in P}?
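The three example metrics can be sketched in a few lines of Python. Everything named here (lp_distance, discrete_distance, debate_groups, graph_distance, and the made-up club members and sponsor network) is an assumption for illustration only. The graph distance counts edges using breadth-first search, and the network lists every connection in both directions so that the distance stays symmetric.

```python
from collections import deque


def lp_distance(x, y, p=2):
    """L^p distance between two equal-length sequences of numbers (p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def discrete_distance(x, y):
    """0 if the two objects are equal, 1 otherwise."""
    return 0 if x == y else 1


def debate_groups(answers):
    """Group club members whose poll answers are at discrete distance 0."""
    groups = []
    for person, answer in answers.items():
        for group in groups:
            if discrete_distance(answers[group[0]], answer) == 0:
                group.append(person)
                break
        else:
            groups.append([person])  # nobody has given this answer yet
    return groups


def graph_distance(graph, start, goal):
    """Fewest edges from start to goal (breadth-first search); None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


print(lp_distance((0, 0), (3, 4), p=1))  # taxi-cab distance: 7.0
print(lp_distance((0, 0), (3, 4), p=2))  # Euclidean distance: 5.0

poll = {"Ann": "for", "Bob": "against", "Cal": "for", "Dee": "unsure"}
print(debate_groups(poll))  # [['Ann', 'Cal'], ['Bob'], ['Dee']]

# Hypothetical referral network: x1, x2 have time for us; p1, p2 are the powerful ones.
network = {"x1": ["mid"], "x2": ["p1"], "mid": ["x1", "p2"], "p1": ["x2"], "p2": ["mid"]}
X, P = ["x1", "x2"], ["p1", "p2"]
distances = [graph_distance(network, x, p) for x in X for p in P]
print(min(d for d in distances if d is not None))  # 1
```

In the sponsor example, pairs with no connecting path are skipped, so the minimum is taken only over sponsors we can actually reach.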

Definition (Norm) – Given a vector space V, the norm of a point in the space measures the distance between that point and the 0 point of that space, which is sort of like the length of the vector connecting the 0 point to that point. A norm is a function ||.||:V→R that is:

1. Non-negative:

\forall v \in V,  ||v||≥0    and  ||v||=0 if and only if v=0

2. Homogeneous:

\forall v \in V, \forall a \in R,  ||av||=|a|·||v||

3. Triangle Inequality:

\forall u,v \in V,  ||u+v||≤||u||+||v||

How can we use norms in real life and in math??

1. The Lp Norms – The Lp norms and the Lp metrics determine each other: the Lp distance between x and y is the Lp norm of x-y, and the Lp norm of x is its Lp distance from 0. So we can use the Lp norms in the same manner that we use the Lp metrics. For example, let’s assume that we want our GPS to always give us the distance from our home to work, the store, school, restaurants, bars, etc. Then the GPS is actually calculating the L2 norm (or the L1 norm if we’re driving along city blocks and cannot take shortcuts), where the 0 point is home.

2. Inner Product Induced Norms – Given an inner product, we can define a norm to be ||x||=(<x,x>)^0.5. This norm is very useful in determining how far a function or polynomial is from the 0 function or polynomial (these norms are often used on function or polynomial spaces), thus giving us a way to compare functions by size. For example, suppose we had a drunk driving a car with paint-covered wheels (the guy is drunk, so he’ll drive in random directions at random speeds, and the painted wheels let us determine how he drove), and we wanted to find his average driving speed. We’d have this drunk drive (do not try at home!) the car for one hour (until he passes out from alcohol poisoning), and then stop. Afterwards, we’d use the paint marks that the car left on the asphalt to write a function x(t) for the driving path, and then we’d measure the length of that path by adding up (integrating) the norms ||x'(t)||=(<x'(t),x'(t)>)^0.5 of his velocity over the hour, where the inner product here is the usual dot product in the plane. After that, we’d calculate the average speed by taking this distance (the length of his path, in kilometers) and dividing it by the time (one hour), so his average speed in kilometers per hour is simply the length of the path.
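Both norm examples can be tried out in Python. This is only a sketch under some assumptions: the coordinates are made up, home sits at the 0 point, and the paint trail is approximated by a few straight segments, so the path length is the sum of the dot-product-induced norms of those segments. The names lp_norm, dot, induced_norm and path_length are invented for this illustration.

```python
import math


def lp_norm(v, p=2):
    """L^p norm of a vector, i.e. its L^p distance from the 0 point (p >= 1)."""
    return sum(abs(a) ** p for a in v) ** (1 / p)


def dot(u, v):
    """The usual inner product (dot product) of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def induced_norm(v):
    """Norm induced by the inner product: ||v|| = (<v,v>)^0.5."""
    return math.sqrt(dot(v, v))


def path_length(points):
    """Length of a path made of straight segments: the sum of the segments' norms."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        total += induced_norm((x1 - x0, y1 - y0))
    return total


work = (3, 4)                # hypothetical position of work, in km, with home at (0, 0)
print(lp_norm(work, p=2))    # straight-line distance from home: 5.0
print(lp_norm(work, p=1))    # distance along the blocks: 7.0

paint_marks = [(0, 0), (3, 4), (3, 10), (11, 16)]  # hypothetical trail, in km
hours = 1.0
print(path_length(paint_marks) / hours)  # average speed in km/h: 21.0
```

With more paint marks sampled along the trail, the sum of segment norms gets closer to the true length of a curved path.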

And here’s a little exercise for the sharpest of you all: Show that all of these measurements are good measurements. 

Now we can see how measurements come from math, what some good measurements are, and how to categorize measurements. So now the obvious question arises: what’s a bad measurement?

A bad measurement is a measurement that isn’t a function (meaning that it’s ambiguous), and now we’ll define it by negating the “one and only one” part of the definition of a good measurement.

Good measurement: \forall x \in X, \exists! y \in R: M(x)=y

Bad measurement: \exists x \in X: \exists y≠z \in R:  M(x)=y and M(x)=z

This is really, really bad, because we can see here that M(x) has (at least) 2 different values, which is something terrible. It’s exactly as if I were to say that I am both 20 years old and 15 years old. That’s terrible, because according to the law, I’m both allowed to buy cigarettes and not allowed to buy cigarettes (not that I smoke, but I may need to buy cigarettes for a friend or something), I’m both allowed to drive a car and not allowed to drive a car (despite the fact that I have a driver’s license), I both have to pay taxes and don’t have to pay taxes, I both have to live with my legal guardian and am able to live on my own, I’m both allowed to vote and not allowed to vote, I both can and cannot get a tattoo, and so on and so forth. So basically, I really don’t know what I can and cannot do if I’m both 20 years old and 15 years old, and I’ll have to live in fear that if I drive a car (because I’m 20 years old), I’ll be arrested (because I’m 15 years old). By the way, I’m 20 years old.
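As a rough sketch, we can let Python decide whether a list of recorded readings behaves like a good measurement. The assumption here is that a measurement is given as a list of (object, value) pairs, and the function names and data are made up for this illustration. The check reports objects with more than one value (the bad case defined above) and also objects with no value at all, since a function must give every object exactly one value.

```python
from collections import defaultdict


def classify_measurement(domain, readings):
    """Return (ambiguous, missing): objects with several values, and objects with none."""
    values = defaultdict(set)
    for x, y in readings:
        values[x].add(y)
    ambiguous = {x: ys for x, ys in values.items() if len(ys) > 1}
    missing = [x for x in domain if x not in values]
    return ambiguous, missing


def is_good_measurement(domain, readings):
    """True if every object in the domain has exactly one recorded value."""
    ambiguous, missing = classify_measurement(domain, readings)
    return not ambiguous and not missing


people = ["me"]
print(is_good_measurement(people, [("me", 20)]))              # True
print(is_good_measurement(people, [("me", 20), ("me", 15)]))  # False
print(classify_measurement(people, [("me", 20), ("me", 15)])[0])
```

The last line prints the ambiguous objects together with all of their conflicting values, which is exactly the “both 20 and 15 years old” situation.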

Now let’s bring up an example of a bad measurement: IQ tests as a way to measure intelligence. 

As a part of my research for this post, I tried an IQ test online at IQTest.com. While taking the test I realized the following problems with this test, which make it a bad measurement for intelligence:

1. The questions are true/false, so you have a 50% chance of correctly answering a question that you have no clue about. The test consists of 38 questions, so someone who can only answer 15 of them correctly (apparently not intelligent according to this test) and guesses the rest still has about a 3x(10^-5) chance of getting roughly 95% of the test right (at least 36 of the 38 questions).

2. The questions tested primarily mathematical skills and counting/grammar accuracy, so someone who is very intelligent in other areas but isn’t good at math will not be considered intelligent according to this test. I’ve always had high scores on these types of tests, since I’m good at math, though I know that if these tests were checking something else, like reading skills, then I’d fail miserably. Here are some of the questions (all true/false). I’ll comment on each question right after it:

  • 27 minutes before 7 o’clock is 33 minutes past 5 o’clock. Comment: Seriously??? This only checks how accurately someone reads, and some basic addition/subtraction (done without a calculator). Someone may fail this question simply because they aren’t very accurate. I don’t think that accuracy is a clear sign of intelligence, though it can help someone pass these sorts of tests.
  • Fred will be four blocks from his starting place if he travels two blocks north, then three blocks east, and then two blocks south. Comment: What if we aren’t good at directions? Does this mean that we aren’t intelligent? Also, who uses north, south, east, west nowadays?
  • Nine chickens, two dogs, and three cats have a total of forty legs. Comment: Some people don’t know what these animals are, and some people have only seen 3-legged cats. Also, once again, this question only checks how accurate of a calculator someone is, not really how likely they are to succeed in life.
  • Sixteen hours are to one day as twenty days are to June’s length. Comment: Again, why all of the math questions? Calculation accuracy isn’t synonymous with intelligence, so why use it to measure intelligence? Einstein wasn’t good at manual calculations, yet he’s considered to be one of the most intelligent people ever.
  • If the word, “TAN,” is written under the word, “SLY,” and the word, “TOT,” is written under “TAN,” then the word, “SAT,” is formed diagonally. Comment: Again, someone who’s good at math is more likely to get this question correct than someone who isn’t.
  • This sentence has thirty-five letters. Comment: This measures nothing besides how accurately someone can count under pressure. I don’t really think that counting accuracy under a time limit is a sign of intelligence.
  • The number 64 is the next logical number in the following sequence of numbers: 2, 6, 14, 30… Comment: Again, total math question. Someone who’s good at math would succeed at this question, whereas someone who isn’t won’t.
  • The sum of all the odd numbers from zero to 16 is an even number. Comment: Why the math questions? There’s a shortcut to the answer (for those of you who are good at math). Again, people who may be at the top of their field in some non-math/science area will not figure out this shortcut.
  • If each of seven persons in a group shakes hands with each of the other six persons, then a total of forty-two handshakes occurs. Comment: Come on. This is a question in combinatorics, which anyone who has ever taken a statistics/combinatorics course will succeed in, whereas people who haven’t will have to try hard. So now we see that this test also depends on the person’s education.
  • If a doughnut shaped house has two doors to the outside and three doors to the inner courtyard, then it’s possible to end up back at your starting place by walking through all five doors of the house without ever walking through the same door twice. Comment: Graph theory question? So this test is trying to say that only mathematicians are intelligent.
  • If the word, “quane,” is understood to mean the same as the word, “den,” then the following sentence is grammatically correct: “Looking out from my quane, I could see a wolf enter quane.” Comment: Again, another question that mathematicians/scientists will be better at, since they’re used to working with varied inputs (in this case, “quane”), so once again, the intelligent humanities majors will have a clear disadvantage.
  • The words, “auctioned, education, and cautioned,” all use the exact same letters. Comment: This only shows how accurately someone can rearrange these letters.

3. This test was written for all people of all cultures, ages, races, educational levels, etc., yet it’s obvious that people of different ages and different educational levels will score differently. Also, some of these questions include animals which some people are not familiar with. Therefore, we have to be more accurate about which social subsets this test is intended for.
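The guessing estimate from point 1 can be checked with a short Python sketch. It assumes that the test-taker gets all 15 known questions right and guesses the other 23 independently with a 50% chance each, and the function name chance_of_at_least is made up for this illustration. Reading “about 95%” as at least 36 of 38 correct is also an assumption: it’s the reading that gives a number near 3x(10^-5), while strictly more than 95% would need 37 correct.

```python
from math import comb


def chance_of_at_least(total_questions, known, target_correct):
    """Chance of reaching target_correct when the unknown questions are coin-flip guesses."""
    guessed = total_questions - known
    needed = max(0, target_correct - known)  # lucky guesses required
    if needed > guessed:
        return 0.0
    favourable = sum(comb(guessed, k) for k in range(needed, guessed + 1))
    return favourable / 2 ** guessed


print(chance_of_at_least(38, 15, 36))  # about 3.3e-05
print(chance_of_at_least(38, 15, 37))  # about 2.9e-06
```

Either way the chance is tiny for someone who knows only 15 answers, but it grows quickly as the number of guessed questions shrinks.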

All of the questions were very similar to these, so as we can see, there’s a clear recurring theme of math+accuracy=intelligence. Obviously, math and accuracy are not the only indicators of intelligence, and there are many other types of intelligence, such as interpersonal, linguistic, bodily, spatial, musical, etc. In addition, people who received very high scores on this test may not be intelligent in many other areas. For example, according to this test, I have a 185 IQ (I don’t believe this at all. I’m not more intelligent than most people; rather, I’m simply good at recognizing patterns and applying basic math skills, which are skills that I acquired as a mathematician), despite the fact that I find reading (in all natural languages) to be very difficult, I am very clumsy, I cannot discern between pitches (despite being a music theorist), I have a problem concentrating in slow-paced subjects, I have no sense of direction, I have a terrible memory, and I had very low grades in high school in nearly every subject. Basically, I fail in all areas of intelligence besides the mathematical one, and despite this, I’m still considered to be a super-genius according to this test.

So why do people use these IQ tests as an indicator of intelligence? I have no idea. Now, how can we improve this measurement? Not so simple:

1. Make this an open-answer written test, in order to get rid of the chance that someone will get a high score just by guessing correct answers.

2. Include questions that people who are not necessarily good at math will succeed at, such as reading comprehension questions, drawing-related questions, story writing, etc. For those of you interested, all of the types of intelligence are mathematical, spatial, linguistic, musical, bodily, intrapersonal, interpersonal, naturalistic, and existential. You can read about that here.

3. Include questions such as, “how do you think an intelligence test should be written,” in order to check someone’s creativity, since creativity is a very clear part of intelligence.

4. Make different IQ sets for different age groups.

Ok, IQ tests are flawed. Now is that so wrong? Yes, it is, since these flawed IQ tests are used to determine whether a child is “normal,” “mentally retarded,” or “genius.” Categorizing young children with this IQ test can cause many problems, such as low self-esteem (if the child is considered to be retarded or normal), inappropriate treatment, and academic pressure (if the child is considered to be a genius, then he’s more likely to feel pressured to get high grades in subjects that he may not enjoy or may not truly excel in), and this list can go on and on forever. These disadvantages are present in nearly all poor measurements, and therefore, it’s crucial that we make sure that all of the measurements that we use in life, from IQ tests to building dimensions to time, are functions that are well defined and well executed.