At 12:44 AM 10/10/2006, Ramakalyan wrote:
>I have a large data (typically, several million records) of non-negative
>numbers. I want to sort this data into 5 bins -- even, odd, prime, rational,
>and irrational. I am given samples of these classes, i.e., each bin is given
>along with a few numbers in that class. For the rest of the numbers, the
>constraint is that I should not make use of the properties of these classes
>(for instance, I cannot use the fact that an odd number when divided by 2
>always leaves a remainder of 1); instead I should compare with the samples
>and somehow *learn* to identify the classes.
>
>Is there any way I can do this using purely statistical techniques?

Your question is not well-posed. All of the classes you specify are
defined by the arithmetic (field) properties of the rational and
irrational numbers. If you exclude those defining properties, nothing else
can be inferred from the symbols used to write the numbers.
"1", "2", "3", etc. are purely arbitrary symbols assigned to numbers. Any
other symbols could also have been used. These symbols convey no useful
information.
If you are trying to make a "toy" problem, you must supply some source of
information that is correlated to the property at hand.
At a minimum, you must allow the ordinal property of integers. That is the
definition of the symbols used. From this one could infer statistically
"evenness" and "oddness" without recourse to division.
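
A minimal sketch of that idea in Python, under stated assumptions: the two
classes carry opaque tags ("A" and "B"), and the only operations allowed are
stepping to a successor or predecessor and comparing order. The names
alternation_rate and classify and the sample data are hypothetical, chosen
only for illustration. The sketch first estimates from the labeled samples
how often neighbouring integers carry different tags, then classifies an
unlabeled integer by walking to it from the nearest labeled sample and
flipping the tag at every step.

```python
def alternation_rate(samples):
    """Fraction of labeled neighbour pairs (n, n+1) whose tags differ."""
    pairs = [(n, n + 1) for n in samples if n + 1 in samples]
    if not pairs:
        return None  # no adjacent labeled samples, nothing can be learned
    differing = sum(samples[a] != samples[b] for a, b in pairs)
    return differing / len(pairs)


def classify(n, samples, flip):
    """Walk from the nearest labeled sample to n, flipping the tag each step."""
    anchor = min(samples, key=lambda s: abs(s - n))
    label = samples[anchor]
    step = 1 if n >= anchor else -1
    current = anchor
    while current != n:
        current += step       # successor or predecessor, no division
        label = flip[label]   # learned rule: neighbours alternate
    return label


# Hypothetical training set: tags are arbitrary, their meaning is not given.
samples = {2: "A", 3: "B", 4: "A", 7: "B", 8: "A", 11: "B", 12: "A"}
flip = {"A": "B", "B": "A"}

rate = alternation_rate(samples)
if rate is not None and rate > 0.95:
    print(classify(9, samples, flip), classify(20, samples, flip))
```

The walk costs time proportional to the distance from the nearest sample, so
this is only a toy; it is meant to show that the alternation can be estimated
from ordered, labeled data rather than from the remainder on division by 2.
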
How you would extend this to rational numbers is a much larger problem.
The irrational number problem is impossible. I suspect it would be difficult
for you to even provide a representative training set. (And don't forget
that only rational numbers can be represented in a computer's
floating-point arithmetic.)
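
A short Python check of that last point, assuming ordinary IEEE double
precision floats (the variable names are illustrative only): the stored
value of sqrt(2) is an exact ratio of two integers, so any "irrational"
training example held as a float is really a rational number.

```python
import math
from fractions import Fraction

x = math.sqrt(2)        # nearest double to the irrational number sqrt(2)
as_ratio = Fraction(x)  # the exact rational value the float actually holds

print(as_ratio.numerator, "/", as_ratio.denominator)
print(as_ratio * as_ratio == 2)  # False: the stored value is not sqrt(2)
```
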
================================================================
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral@...
Least Cost Formulations, Ltd. URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239 Fax: 757-467-2947
"Vere scire est per causas scire"
================================================================