Statisticians_group

Messages 3420 - 3449 of 4077


#3448 From: "predictionimpact2" <elise@...>
Date: Sun May 10, 2009 7:38 am
Subject: Predictive Analytics Seminar: May 27-28, New York City
predictionim...
 
Hi all,

I wanted to let you know about our training seminar on predictive analytics -
coming May, Oct, and Nov in NYC, Stockholm, DC and other cities.  This is
intensive training for marketers, managers and business professionals to make
actionable sense of customer data by predicting buying behavior, churn, etc. 
Past attendees provided rave reviews.

Here's more info:
----------------------

Training Program: Predictive Analytics for Business, Marketing and Web

A two-day intensive seminar brought to you by Prediction Impact, Inc.

Dates: May 27-28, Oct 14-15, Oct 18-19, and Nov 11-12, 2009
Locations: NYC (May), Stockholm (Oct), DC (Oct), San Francisco (Nov)

93% rate this program Excellent or Very Good.
**The official training program of Predictive Analytics World**
**Offered in conjunction with eMetrics events**

Also see our Online Training: Predictive Analytics Applied - immediate access at
any time: www.predictionimpact.com/predictive-analytics-online-training.html


ABOUT THIS SEMINAR:

Business metrics do a great job summarizing the past. But if you want to predict
how customers will respond in the future, there is one place to turn--predictive
analytics. By learning from your abundant historical data, predictive analytics
provides the marketer something beyond standard business reports and sales
forecasts: actionable predictions for each customer. These predictions encompass
all channels, both online and off, foreseeing which customers will buy, click,
respond, convert or cancel. If you predict it, you own it.

The customer predictions generated by predictive analytics deliver more relevant
content to each customer, improving response rates, click rates, buying
behavior, retention and overall profit. For online applications such as
e-marketing and customer care recommendations, predictive analytics acts in
real-time, dynamically selecting the ad, web content or cross-sell product each
visitor is most likely to click on or respond to, according to that visitor's
profile. This is AB selection, rather than just AB testing.

Predictive Analytics for Business, Marketing and Web is a concentrated training
program that includes interactive breakout sessions and a brief hands-on
exercise. In two days we cover:

- The techniques, tips and pointers you need in order to run a successful
predictive analytics and data mining initiative

- How to strategically position and tactically deploy predictive analytics and
data mining at your company

- How to bridge the prevalent gap between technical understanding and practical
use

- How a predictive model works, how it's created and how much revenue it
generates

- Several detailed case studies that demonstrate predictive analytics in action
and make the concepts concrete

- NEW TOPIC: Five Ways to Lower Costs with Predictive Analytics


No background in statistics or modeling is required. The only specific knowledge
assumed for this training program is moderate experience with Microsoft Excel or
equivalent.

For more information, visit
www.predictionimpact.com/predictive-analytics-training.html, or e-mail us at
training@....  You may also call (415) 683-1146.

Cross-Registration Special: Attendees earn $250 off the Predictive Analytics
World Conference

SNEAK PREVIEW VIDEO: www.predictionimpact.com/predictive-analytics-times.html

$100 off early registration, 3 weeks ahead

#3447 From: Andrew Hartley <khahstats@...>
Date: Sat May 9, 2009 9:59 pm
Subject: Re: Re: Testing Normality of biological variables
khahstats
 
Ramesh & Madan,
1. Testing normality: No distribution is exactly normal; normality is merely an idealization that makes life mathematically convenient for us statisticians. What you want to know is whether the population generating the data is close to normal, not whether it is exactly normal.

Therefore, if you test the hypothesis H of normality using a large enough sample from any real population, you will reject H, no matter how small your type 1 error rate alpha is. So even though increasing alpha to 10% (as Madan suggests) may be appropriate for small sample sizes n, for large n an alpha of even 0.01 may be too large.

If one insists on testing H, then to proceed rationally one would need to choose alpha & the sample size so as to have a low probability of rejecting H if H is approximately true, & a high probability of rejecting H if it is badly false. This involves several difficult steps, which I could describe in more detail, but suffice it to say that I agree that the graphical approach Madan described is usually much more practical, unless the consequences of making a wrong decision between H & not-H are really bad.
2. Central Limit Theorem: I suggest you explore the tendency toward normality yourself. Generate some severely non-normal data sets & observe how the means from those sets tend towards normality. I have been surprised at how quickly the means begin to follow a nice symmetric bell-shaped curve.

3. Bonferroni: I don't understand what Madan means by "decreasing the p-value" & "increasing the possibility of declaring the data non-normal". In any case, Bonferroni's approach is generally regarded as an overly rigid & limiting approach to multiple testing. It sets the bar very high for rejecting the tested hypotheses, by markedly reducing the test-wise alpha, which can reduce power more than many people think is reasonable. My opinion is that we usually cannot tell whether the bar is set too high, because we hardly ever identify the losses associated with type 1 & type 2 errors if they occur (if a type 1 error would lead to a huge loss, then you would want to make its probability very small). With some loss functions, I'm sure Bonferroni would do quite nicely, but until we choose a loss function we just never know.
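
The experiment suggested in point 2 can be tried in a few lines. What follows is only a sketch, assuming Python with its standard library; the exponential distribution, the sample sizes 5 and 30, the seed and the function names are illustrative choices, not part of the original discussion. It measures how lopsided the raw data are compared with means of samples drawn from the same distribution. The theorem is about the sample means, not about the raw measurements.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate_sample_means(n_per_sample, n_samples, rng):
    """Draw n_samples exponential samples of size n_per_sample; return their means."""
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n_per_sample)]
        means.append(statistics.fmean(sample))
    return means


rng = random.Random(2009)  # fixed seed so the run is repeatable

# Severely non-normal raw data: exponential, strongly right-skewed.
raw = [rng.expovariate(1.0) for _ in range(5000)]
means_n5 = simulate_sample_means(5, 5000, rng)
means_n30 = simulate_sample_means(30, 5000, rng)

print("skewness of raw data:        ", round(skewness(raw), 2))
print("skewness of means (n = 5):   ", round(skewness(means_n5), 2))
print("skewness of means (n = 30):  ", round(skewness(means_n30), 2))
```

The skewness of the means shrinks toward zero as the sample size grows, while the raw data stay just as skewed as before.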




#3446 From: "Madan Gopal Kundu" <madan4331@...>
Date: Sat May 9, 2009 6:01 pm
Subject: Re: Testing Normality of biological variables
madan4331
 
Hi Ramesh,

The problem you describe with the KS normality test is genuine, and it applies
to other normality tests as well. The moral is that we should not judge the
normality of data solely from the p-value of a normality test.

To overcome this, we can follow either of these strategies:
1. Increase the significance level, i.e. use a 10% significance level instead
of 5%.
2. Use a Quantile-Quantile (Q-Q) plot to assess normality.

I think the second option is better, and it is available in standard software
such as SAS. A nearly straight line in the Q-Q plot indicates normality. A
marked deviation from a straight line indicates non-normality. An alternative
to the Q-Q plot is the Probability-Probability (P-P) plot.
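
For readers without SAS, the points of a normal Q-Q plot can be computed by hand. The sketch below is one way to do it, assuming Python 3.8+ and its standard library; the plotting positions (i - 0.5)/n, the simulated data and the function names are assumptions made for illustration. It also reports the correlation between the two axes as a rough numeric measure of straightness.

```python
import math
import random
from statistics import NormalDist, fmean


def qq_points(data):
    """Return (theoretical normal quantile, observed value) pairs for a Q-Q plot."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(data):
    """Pearson correlation of the Q-Q points; near 1 when the plot is nearly straight."""
    points = qq_points(data)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(450)
normal_like = [rng.gauss(50.0, 5.0) for _ in range(450)]  # simulated, roughly normal
skewed = [rng.expovariate(1.0) for _ in range(450)]       # simulated, right-skewed

print("Q-Q correlation, normal-like data:", round(qq_correlation(normal_like), 4))
print("Q-Q correlation, skewed data:     ", round(qq_correlation(skewed), 4))
```

Plotting the pairs returned by `qq_points` with any charting tool gives the usual picture; the correlation is only a summary and does not replace looking at the plot.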

Coming to your second question, I strongly discourage you from using Bonferroni
or similar methods to adjust for multiplicity. Use of the Bonferroni method
ultimately decreases the p-value, which in turn increases the possibility of
declaring the data non-normal. So DON'T use it.

Hope this helps you.

Thanks & Regards
Madan Gopal Kundu



#3445 From: slop badgerd <slopbadgerd@...>
Date: Sat May 9, 2009 11:17 am
Subject: RE: Testing Normality of biological variables
senoryardstick
 
are your subjects all the same sex? if not then this might be causing the non-normality. nerve size might be normally distributed in males and females separately, but if they have different means (e.g. if males' nerves are bigger than females' nerves) then when you combine males and females you might get a bimodal distribution (a peak at each of the two means) rather than a normal distribution. try looking at the males and females separately. if it's still not normal then it's probably safer to use tests that don't assume normality.

i think the bonferroni correction can be applied to any series of independent tests (not just t-tests)
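
As a small illustration of the arithmetic, here is a sketch in Python; the p-values are made up and the function name is hypothetical. With m tests, Bonferroni compares each p-value with alpha/m, or equivalently multiplies each p-value by m (capped at 1), so it makes each individual rejection harder. The bound holds whether or not the tests are independent.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test alpha, the adjusted p-values and the reject decisions."""
    m = len(p_values)
    per_test_alpha = alpha / m
    adjusted = [min(1.0, p * m) for p in p_values]      # Bonferroni-adjusted p-values
    rejected = [p <= per_test_alpha for p in p_values]  # same decision as adjusted <= alpha
    return per_test_alpha, adjusted, rejected


# Hypothetical p-values from four separate normality tests.
p_values = [0.004, 0.03, 0.02, 0.2]
per_test_alpha, adjusted, rejected = bonferroni(p_values)

print("per-test alpha:     ", per_test_alpha)
print("adjusted p-values:  ", adjusted)
print("rejected after adj.:", rejected)
print("rejected unadjusted:", [p <= 0.05 for p in p_values])
```

In this made-up case three tests are significant at 5% before adjustment and only one afterwards.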
 


#3444 From: "Ramesh S.Ve." <optoramesh_gp@...>
Date: Sat May 9, 2009 9:58 am
Subject: Testing Normality of biological variables
optoramesh_gp
 
Dear Friends,
I am working on data on anatomical measurements of nerve size & area in the eye. Our sample size is 450 subjects. Surprisingly, when each parameter was tested for normality, the Kolmogorov-Smirnov test showed a significant (p<0.05) departure from normality. Though the central limit theorem suggests that a sample greater than 30 would be normal, biostatistical variables are usually not so (I would like to know how many of you would agree). Shall we continue performing parametric tests such as the t test and ANOVA, or should we opt for non-parametric alternatives in analysing these data sets? Kindly advise on:

1. Better methods to test for normality
2. Like the Bonferroni correction for multiple t tests, is there a correction factor (adjusting the significance level of the p-value) when 4 or 5 variables are tested together using independent KS tests?

Thanks in advance

Regards
 Ramesh.S.Ve
PhD student
Sankara Nethralaya
18, College Road
Chennai-600006.
Tamil Nadu
India.
Tel: 91-44-28271616
Fax: 91-44-28254180
Mobile: 91-09444073160
Email: optoramesh_gp@..., sver@...




#3443 From: Ghada Aby sheasha <gsheasha@...>
Date: Sat May 9, 2009 6:19 am
Subject: Re: Fw: Chi Square Test
gsheasha
 

In this case you need to use a corrected chi-square test. Try this site: http://www.people.ku.edu/~preacher/chisq/chisq.htm
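
One possible way to handle a sparse table without special software is a Monte Carlo (permutation) version of the chi-square test. This is a swapped-in technique, not necessarily what the linked site does. The sketch below assumes Python with its standard library; the 4x5 table is invented (total 100) and the function names are hypothetical. Note that the usual rule of thumb concerns the expected counts, not the observed ones: the chi-square approximation becomes doubtful when more than about 20% of expected counts fall below 5.

```python
import random


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic of a contingency table."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


def monte_carlo_p_value(table, n_sim=2000, seed=0):
    """Shuffle column labels across subjects to simulate independence."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            rows.extend([i] * count)   # one row label per subject
            cols.extend([j] * count)   # one column label per subject
    observed = chi_square(table)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(cols)              # margins stay fixed, association is broken
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            sim[i][j] += 1
        if chi_square(sim) >= observed - 1e-9:
            hits += 1
    return (hits + 1) / (n_sim + 1)


# Invented 4x5 table with total frequency 100 and several small cells.
table = [
    [10, 6, 3, 2, 4],
    [5, 8, 7, 3, 2],
    [2, 4, 9, 6, 4],
    [3, 2, 6, 9, 5],
]
low_expected = sum(e < 5 for row in expected_counts(table) for e in row)
p_value = monte_carlo_p_value(table)
print("cells with expected count below 5:", low_expected)
print("chi-square statistic:", round(chi_square(table), 2))
print("Monte Carlo p-value: ", round(p_value, 4))
```

Another common option is to merge sparse rows or columns where that makes subject-matter sense, then apply the ordinary chi-square test to the smaller table.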
 
Best regards,
Dr. Ghada Ahmed Abu Sheasha
Ass. lecturer in Bio-medical Informatics and Medical Statistics Department.
Medical Research Institute
Alexandria University






#3442 From: ramesh vijayalayan <ramesh_vijayalayan2002@...>
Date: Sat May 9, 2009 4:11 am
Subject: Fw: Chi Square Test
ramesh_vijay...
 


--- On Sat, 9/5/09, ramesh vijayalayan <ramesh_vijayalayan2002@...> wrote:

From: ramesh vijayalayan <ramesh_vijayalayan2002@...>
Subject: Chi Square Test
To: Statisticians_ group@yahoogroup
Date: Saturday, 9 May, 2009, 9:40 AM

Please clarify my doubt.

I have a 4x5 cross-tabulation with a total frequency of 100, but several cells (say 7) have frequencies less than 5. How can I use the chi-square test for this table? Can I use Microsoft Excel for the chi-square test?

Ramesh



#3441 From: Hasan ALi <street192005@...>
Date: Fri May 8, 2009 5:46 am
Subject: Re: (unknown)
street192005
 


 
Thanks for answering me. I checked the link and it is clear now.
Regards

 
 






#3440 From: Hasan ALi <street192005@...>
Date: Thu May 7, 2009 3:52 pm
Subject: Re: (unknown)
street192005
 



Mr. Kundu, thanks for your answer. I already know the theory you mentioned, but the aim of my question is to ask what the benefit is of knowing the sum of all the variables and excluding one of them. What is the significance of removing a variable? If we remove a variable, is there a specific variable that must be removed, or can any one of them be obtained by subtracting from the total?
Regards




#3439 From: Madan Kundu <madan4331@...>
Date: Thu May 7, 2009 7:53 am
Subject: Re: (unknown)
madan4331
 
Let me try!
 
Suppose you have 4 variables, say x1, x2, x3, x4, and you know their sum, i.e. x1+x2+x3+x4=n, where n is some known number.

Here you can vary the values of only 3 variables, because the value of the 4th variable is determined by subtracting the sum of the other 3 from n. Since you can vary only 3 variables at will, the degrees of freedom are 3, not 4.

For example, if x1+x2+x3+x4=120, then the value of x4 must be 120-x1-x2-x3. You can vary only x1, x2 and x3; x4 cannot vary freely because it is calculated from 120-x1-x2-x3.

In general, the degrees of freedom equal the number of variables minus the number of constraints.

In the above example, we had 4 variables and one constraint, namely that their sum is 120. So in this case the degrees of freedom are 4-1=3.
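
The example above can be checked numerically. The following is only a sketch in Python; the data values and function name are made up for illustration. The first part shows that once three values are chosen the fourth is forced, and that it does not matter which one is treated as the "last" value. The second part shows the same constraint at work in the sample variance: deviations from the sample mean always sum to zero, so only n-1 of them are free, which is why the divisor is n-1.

```python
import statistics

TOTAL = 120  # the known sum from the example above


def fourth_value(x1, x2, x3, total=TOTAL):
    """The fourth value is fixed once the other three and the total are known."""
    return total - (x1 + x2 + x3)


for x1, x2, x3 in [(10, 20, 30), (50, 50, 50), (0, 0, 0)]:
    print(f"x1={x1}, x2={x2}, x3={x3} -> x4 must be {fourth_value(x1, x2, x3)}")

# Same idea in the sample variance (hypothetical data).
data = [4.0, 7.0, 9.0, 12.0]
mean = statistics.fmean(data)
deviations = [x - mean for x in data]

# One constraint: the deviations sum to zero, so the last is set by the others.
print("sum of deviations:", sum(deviations))
print("last deviation:", deviations[-1], "= minus the sum of the rest:", -sum(deviations[:-1]))

# Hence 4 values, 1 constraint, 3 degrees of freedom in the divisor.
sample_variance = sum(d * d for d in deviations) / (len(data) - 1)
print("variance with n-1 divisor:", sample_variance)
print("statistics.variance:      ", statistics.variance(data))
```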
 
Hope this helps.
 
Regards

--------------

Madan Gopal Kundu
Biostatistician, Ranbaxy Labs. Ltd.
Gurgaon, Haryana
India
mobile: 91-9868788406
 



#3438 From: Madan Kundu <madan4331@...>
Date: Thu May 7, 2009 7:41 am
Subject: Re: Calculation of Confidence Interval
madan4331
 
To calculate a confidence interval for a ratio, we generally use Fieller's theorem. I think the same applies in your case.
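
A rough sketch of a Fieller-type interval follows, in Python. It rests on assumptions not stated in the thread: the two means are independent and approximately normal, `se_a` and `se_b` are their standard errors (standard deviation divided by the square root of the number of measurements behind each mean), and the normal quantile 1.96 is used where a t quantile would be more exact for small groups. The function name and example numbers are hypothetical. The interval consists of the ratios theta for which a - theta*b is not significantly different from zero.

```python
import math


def fieller_interval(a, se_a, b, se_b, z=1.96):
    """Approximate Fieller interval for a/b with independent estimates a and b."""
    # Solve (a - theta*b)^2 = z^2 * (se_a^2 + theta^2 * se_b^2) for theta,
    # i.e. qa*theta^2 + qb*theta + qc = 0.
    qa = b ** 2 - (z * se_b) ** 2
    qb = -2.0 * a * b
    qc = a ** 2 - (z * se_a) ** 2
    if qa <= 0:
        # The denominator is not clearly different from zero: no bounded interval.
        raise ValueError("denominator too uncertain for a bounded interval")
    root = math.sqrt(qb ** 2 - 4.0 * qa * qc)
    return (-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)


# Hypothetical means and standard errors for one A_i, B_i pair.
lower, upper = fieller_interval(a=10.0, se_a=1.0, b=5.0, se_b=0.5)
print("ratio:", 10.0 / 5.0)
print("approximate 95% interval:", round(lower, 3), "to", round(upper, 3))
```

The interval is generally not symmetric around the ratio, which is expected for this method.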
 
Hope this helps!
 
Regards
--------------

Madan Gopal Kundu
Biostatistician, Ranbaxy Labs. Ltd.
Gurgaon, Haryana
India
mobile: 91-9868788406
 



#3437 From: Andrew Hartley <khahstats@...>
Date: Wed May 6, 2009 4:45 pm
Subject: Re: Calculation of Confidence Interval
khahstats
 
Edoardo,
I assume you want a confidence interval for the mean of the population that generates Dataset C? One approach would be to calculate the approximate expectation & variance of each C_i=A_i/B_i, & use those to get a confidence interval (CI) assuming approximate normality of C_i. The approximations to E(C_i) & Var(C_i) are based on a first-order Taylor expansion. If you need me to write them up for you I can probably do that next week, but I don't have them available at the moment.

On the philosophical side: be aware that a CI does not indicate what most people think it indicates. The "95%" in "95% CI" means that, if you repeated the experiment many times, about 95% of the CIs would contain the parameter. We are not generally entitled to say that the probability is 95% that any given CI contains the parameter. Thus, the inferential meaning of a single CI is unclear.
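
The first-order Taylor (delta method) approximation mentioned above can be sketched as follows, in Python. This is one common form, assuming A_i and B_i are independent, both means are well away from zero, and `se_a` and `se_b` are standard errors of the means (standard deviation divided by the square root of the group size); those assumptions, the function name and the example numbers are not from the thread. To first order, E(A/B) is approximately a/b and Var(A/B) is approximately (a/b)^2 * (se_a^2/a^2 + se_b^2/b^2).

```python
import math


def ratio_ci_delta(a, se_a, b, se_b, z=1.96):
    """Delta-method estimate and approximate CI for a/b (independent a and b)."""
    ratio = a / b
    # First-order Taylor expansion: relative variances add for a ratio.
    se_ratio = abs(ratio) * math.sqrt((se_a / a) ** 2 + (se_b / b) ** 2)
    return ratio, ratio - z * se_ratio, ratio + z * se_ratio


# Hypothetical means and standard errors for one A_i, B_i pair.
ratio, lower, upper = ratio_ci_delta(a=10.0, se_a=1.0, b=5.0, se_b=0.5)
print("ratio:", ratio)
print("approximate 95% CI:", round(lower, 3), "to", round(upper, 3))
```

The resulting half-width, z times the standard error of the ratio, can be used directly as the error bar for each C_i. The approximation degrades when B_i is noisy relative to its size.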




#3436 From: "biotechbs" <gedoardo83@...>
Date: Wed May 6, 2009 3:43 pm
Subject: Calculation of Confidence Interval
biotechbs
 
Hi everybody,

This is my question. I have two sets of data: set A (A1 to A8) and set B
(B1 to B8). Each value in both sets is calculated as the average of a
distinct group of measurements and has an associated standard deviation.

Then I have another set of values (set C, the one I want to plot as the final
result), calculated as the ratio between each A value and the corresponding
B value (for example A1/B1, A2/B2, etc.).

How can I calculate a confidence interval for the set C values to use for
plotting error bars?

Thanks a lot.
Edoardo

#3435 From: Khassoum Diallo <kdiallo@...>
Date: Wed May 6, 2009 7:21 am
Subject: Re: (unknown)
kdiallo
 
 
Dear Hasan,
 
You need a number of data points or values in order to estimate a parameter (e.g. the mean). The degrees of freedom is the minimum number of these data points.
 
A good explanation is given at the following link.
 
 
Cheers,
Khassoum 







#3434 From: Hasan ALi <street192005@...>
Date: Wed May 6, 2009 6:43 am
Subject: (No subject)
street192005
 
I have read almost all the statistics books, but I am still not clear about degrees of freedom. Is there anybody who can explain it simply enough to be digestible?
Regards


#3433 From: Rakesh Saroj <coolvershabhu@...>
Date: Tue May 5, 2009 7:14 am
Subject: Vacancy
coolvershabhu
 
#3432 From: Andrew Hartley <khahstats@...>
Date: Sun May 3, 2009 6:50 pm
Subject: RE: pearson correlation test with permutation
khahstats
 

David,

Yes, I hope someone in the group has worked with your type of spatial correlation data. True, you would need computer software to permute even an appreciable fraction of the possible combinations of 10 pairs of nests.

I still don’t see how the correlation could be exactly zero; everything is correlated with everything else in this world. You know the saying: a butterfly flapping its wings in North America can start a chain of events that leads to a hurricane in China. But anyway, that’s not the most important issue here. Rather, I want to alert you that a dataset could actually support the null hypothesis Ho over the alternative Ha, no matter how small the p-value is. So, if you insist that Ho is a priori possible, then Ho could have posterior probability >50% even with a very small p-value. This result is known as Lindley’s Paradox. Therefore, it is not correct to “infer that dispersal is probably not random” using the p-value alone. The problem is that a p-value is a statement about data assuming this or that hypothesis, rather than a statement about any hypothesis given data.

--- On Sun, 5/3/09, slop badgerd <slopbadgerd@...> wrote:


From: slop badgerd <slopbadgerd@...>
Subject: RE: [Statisticians_group] pearson correlation test with permutation
To: statisticians_group@...
Date: Sunday, May 3, 2009, 2:05 PM

thanks for your advice andrew, though i'm not sure how i would go about permuting all the possible combinations of 10 nests without some sort of computer program, as it would surely take a long time to do it manually.
 
"David, if you obtain a p-value, what would it tell you about your proposed null hypothesis (H) of absolutely no correlation between distance D & relatedness R? A p-value will only indicate the probability of data at least as extreme as what you observed, assuming H; it’s a statement about data, rather than about H. Besides, I would have a difficult time accepting a priori that H could be true in the first place, & since I’m already (almost) sure H is false, I have little reason to test it."
 
the null hypothesis of no correlation between distance and relatedness represents what would be expected if the bees dispersed to a random site within their patch (which is certainly very plausible). if the pearson correlation coefficient was negative and the p-value was low (less than 0.05 for my purposes) i could infer that dispersal is probably not random, but that the bees disperse only short distances and set up nests close to their sisters (which is also plausible). conversely, if there was statistically significant positive correlation, that would suggest the bees actively disperse as far away as possible from their sisters, which again is ecologically interesting and plausible (e.g. to avoid competing against their relatives).
 
"If I was in your position I think I would rather determine the probability that the correlation coefficient (rho, say) is within some practically meaningful range, [a,b]. viz., the approach should rather be one of estimating the correlation, not seeing whether the correlation is exactly zero."
 
if there is a real correlation i have no reason to expect that the correlation will be a particular value (or within a particular range of values) and in fact it isn't particularly important in my case as (for what i am researching at this stage anyway) it doesnt really matter if the correlation is 0.3 versus 0.5 say - the ecological implications would be similar. in contrast, the ecological implications of any correlation (whatever its exact value) would be different from no correlation. hence at this stage im really only interested in testing whether my observed correlation is statistically different from zero. however certainly in the future testing whether any correlation is within a particular range of values could be an interesting follow up question and i thank you for the suggestion.
 
but at this stage im still just looking for a way of conducting a permutation-based significance test (2-tailed) for the pearson correlation between the 2 variables. if anyone has any suggestions it really would be a huge help.
 
david
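
To make the Lindley's Paradox point at the top of this reply concrete, here is a minimal sketch in the textbook normal-mean setting (not the bee data). Everything in it is an illustrative assumption: Ho says the mean is exactly 0, Ha gives the mean a Normal(0, tau^2) prior, the two hypotheses start at 50/50, and the function name and default values are made up for this example. It holds the z-statistic (and so the p-value) fixed and shows how the posterior probability of Ho changes with sample size.

```python
import math


def posterior_prob_null(z, n, tau=1.0, sigma=1.0, prior_null=0.5):
    """Posterior P(Ho | data) for Ho: mean = 0 versus Ha: mean ~ Normal(0, tau^2).

    z is the usual z-statistic, sqrt(n) * sample_mean / sigma, so fixing z
    fixes the two-sided p-value. All default values are illustrative assumptions.
    """
    ratio = n * tau ** 2 / sigma ** 2
    # Bayes factor of Ho against Ha for a sample mean that is normally distributed.
    bf01 = math.sqrt(1.0 + ratio) * math.exp(-0.5 * z * z * ratio / (1.0 + ratio))
    posterior_odds = (prior_null / (1.0 - prior_null)) * bf01
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # z = 1.96 corresponds to a two-sided p-value of about 0.05.
    for n in (10, 100, 1000, 100000):
        print(n, round(posterior_prob_null(1.96, n), 3))
```

Under these assumptions the same p-value goes from favouring Ha at small n to favouring Ho at large n, which is the sense in which a small p-value by itself says little about the probability of Ho.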

 



#3431 From: slop badgerd <slopbadgerd@...>
Date:: Sun May 3, 2009 6:05 pm
Subject:: RE: pearson correlation test with permutation
senoryardstick
 
thanks for your advice andrew, though im not sure how i would go about permuting all the possible combinations of 10 nests without some sort of computer program, as it would surely take a long time to do it manually.
 
"David, if you obtain a p-value, what would it tell you about your proposed null hypothesis (H) of absolutely no correlation between distance D & relatedness R? A p-value will only indicate the probability of data at least as extreme as what you observed, assuming H; it’s a statement about data, rather than about H. Besides, I would have a difficult time accepting a priori that H could be true in the first place, & since I’m already (almost) sure H is false, I have little reason to test it."
 
the null hypothesis of no correlation between distance and relatedness represents what would be expected if the bees dispersed to a random site within their patch (which is certainly very plausible). if the pearson correlation coefficient was negative and the p-value was low (less than 0.05 for my purposes) i could infer that dispersal is probably not random, but that the bees disperse only short distances and set up nests close to their sisters (which is also plausible). conversely, if there was statistically significant positive correlation, that would suggest the bees actively disperse as far away as possible from their sisters, which again is ecologically interesting and plausible (e.g. to avoid competing against their relatives).
 
"If I was in your position I think I would rather determine the probability that the correlation coefficient (rho, say) is within some practically meaningful range, [a,b]. viz., the approach should rather be one of estimating the correlation, not seeing whether the correlation is exactly zero."
 
if there is a real correlation i have no reason to expect that the correlation will be a particular value (or within a particular range of values) and in fact it isn't particularly important in my case as (for what i am researching at this stage anyway) it doesnt really matter if the correlation is 0.3 versus 0.5 say - the ecological implications would be similar. in contrast, the ecological implications of any correlation (whatever its exact value) would be different from no correlation. hence at this stage im really only interested in testing whether my observed correlation is statistically different from zero. however certainly in the future testing whether any correlation is within a particular range of values could be an interesting follow up question and i thank you for the suggestion.
 
but at this stage im still just looking for a way of conducting a permutation-based significance test (2-tailed) for the pearson correlation between the 2 variables. if anyone has any suggestions it really would be a huge help.
 
david

 

To: Statisticians_group@...
From: khahstats@...
Date: Sat, 2 May 2009 21:35:13 -0700
Subject: Re: [Statisticians_group] pearson correlation test with permutation



David, if you obtain a p-value, what would it tell you about your proposed null hypothesis (H) of absolutely no correlation between distance D & relatedness R? A p-value will only indicate the probability of data at least as extreme as what you observed, assuming H; it’s a statement about data, rather than about H. Besides, I would have a difficult time accepting a priori that H could be true in the first place, & since I’m already (almost) sure H is false, I have little reason to test it.

 

If I was in your position I think I would rather determine the probability that the correlation coefficient (rho, say) is within some practically meaningful range, [a,b]. viz., the approach should rather be one of estimating the correlation, not seeing whether the correlation is exactly zero.

 

When all the data (D & R) are i.i.d., this can be done using the posterior probability distribution for rho given in

http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=155684&blobtype=pdf

Your case may be a little more complicated because, as you note, the data between any pair of nests A & B depend on the data between (say) A & C (i.e., whenever pairs of nests share a member). I don’t have any experience with this type of spatial analysis; however, I would guess that since you are trying to infer something about the relation between D & R, rather than D & R themselves, I would guess that taking all 21*20/2=210 pairs of D & R as independent would not bias the sample statistic (rho-hat); it would only complicate the calculation of the variance of rho-hat (conditional on rho) & hence of the spread of the posterior probability distribution of rho.

 

You are looking for an answer ASAP. Therefore, without researching this in depth, I would say that you could

  1. Split the 21 nests into 2 sets of 10 A & B, leaving 1 out.
  2. Calculate rho-hat for those 10 pairs of nests.
  3. Form 2 sets of 10 nests each in a new way.
  4. Calculate rho-hat for those 10 new pairs of nests.
  5. Iterate the above until you have done this for all possible setups of 10 pairs.
  6. Calculate the mean of all the resulting rho-hats.
  7. Compare that mean with the rho-hat you get when treating all the 210 pairs of D & R as independent.

You could do something similar, I imagine, comparing 2 ways of calculating the variance of rho-hat (conditional on rho), making an adjustment for the square root of the sample size.

 

I don’t know how good is this method of handling the dependencies; someone else may be able to provide a better answer. Nonetheless, I do feel strongly that the focus should be on estimation rather than significance testing. Best wishes.



#3430 From: Andrew Hartley <khahstats@...>
Date:: Sun May 3, 2009 4:35 am
Subject:: Re: pearson correlation test with permutation
khahstats
 

David, if you obtain a p-value, what would it tell you about your proposed null hypothesis (H) of absolutely no correlation between distance D & relatedness R? A p-value will only indicate the probability of data at least as extreme as what you observed, assuming H; it’s a statement about data, rather than about H. Besides, I would have a difficult time accepting a priori that H could be true in the first place, & since I’m already (almost) sure H is false, I have little reason to test it.

 

If I was in your position I think I would rather determine the probability that the correlation coefficient (rho, say) is within some practically meaningful range, [a,b]. viz., the approach should rather be one of estimating the correlation, not seeing whether the correlation is exactly zero.

 

When all the data (D & R) are i.i.d., this can be done using the posterior probability distribution for rho given in

http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=155684&blobtype=pdf

Your case may be a little more complicated because, as you note, the data between any pair of nests A & B depend on the data between (say) A & C (i.e., whenever pairs of nests share a member). I don’t have any experience with this type of spatial analysis; however, since you are trying to infer something about the relation between D & R, rather than about D & R themselves, I would guess that taking all 21*20/2=210 pairs of D & R as independent would not bias the sample statistic (rho-hat); it would only complicate the calculation of the variance of rho-hat (conditional on rho) & hence of the spread of the posterior probability distribution of rho.

 

You are looking for an answer ASAP. Therefore, without researching this in depth, I would say that you could

  1. Split the 21 nests into 2 sets of 10 (A & B), leaving 1 out.
  2. Calculate rho-hat for those 10 pairs of nests.
  3. Form 2 sets of 10 nests each in a new way.
  4. Calculate rho-hat for those 10 new pairs of nests.
  5. Iterate the above until you have done this for all possible setups of 10 pairs.
  6. Calculate the mean of all the resulting rho-hats.
  7. Compare that mean with the rho-hat you get when treating all the 210 pairs of D & R as independent.
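
The seven steps above can be sketched in Python roughly as follows. This is one reading of the procedure, not a tested method: it assumes that "10 pairs" means matching each nest in A with one nest in B, it samples random splits instead of enumerating every possible setup (there are far too many), and it runs on made-up matrices because the real distance and relatedness data are not in this thread. All function names are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def make_demo_matrices(n_nests=21, seed=42):
    """Synthetic (made-up) nests: relatedness falls off with distance, plus noise."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n_nests)]
    dist = [[0.0] * n_nests for _ in range(n_nests)]
    rel = [[0.0] * n_nests for _ in range(n_nests)]
    for i, j in combinations(range(n_nests), 2):
        d = math.dist(pts[i], pts[j])
        r = 1.0 - d + rng.gauss(0.0, 0.05)
        dist[i][j] = dist[j][i] = d
        rel[i][j] = rel[j][i] = r
    return dist, rel


def all_pairs_rho(dist, rel):
    """Step 7 reference value: rho-hat over all n*(n-1)/2 pairs, treated as independent."""
    idx = list(combinations(range(len(dist)), 2))
    return pearson([dist[i][j] for i, j in idx], [rel[i][j] for i, j in idx])


def split_rho(dist, rel, rng):
    """Steps 1-2: split nests into A and B (odd nest left out), pair them up, get rho-hat."""
    nests = list(range(len(dist)))
    rng.shuffle(nests)
    half = len(nests) // 2
    pairs = list(zip(nests[:half], nests[half:2 * half]))  # no nest is used twice
    return pearson([dist[i][j] for i, j in pairs], [rel[i][j] for i, j in pairs])


def mean_split_rho(dist, rel, n_splits=2000, seed=0):
    """Steps 3-6: repeat over many random splits and average the rho-hats."""
    rng = random.Random(seed)
    return sum(split_rho(dist, rel, rng) for _ in range(n_splits)) / n_splits


if __name__ == "__main__":
    dist, rel = make_demo_matrices()
    print("all 210 pairs:", round(all_pairs_rho(dist, rel), 3))
    print("mean over splits:", round(mean_split_rho(dist, rel), 3))
```

Because no nest appears in more than one pair within a split, the 10 pairs in each split do not share a member, which is the reason for splitting in the first place.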

You could do something similar, I imagine, comparing 2 ways of calculating the variance of rho-hat (conditional on rho), making an adjustment for the square root of the sample size.

 

I don’t know how good this method of handling the dependencies is; someone else may be able to provide a better answer. Nonetheless, I do feel strongly that the focus should be on estimation rather than significance testing. Best wishes.


--- On Sat, 5/2/09, slop badgerd <slopbadgerd@...> wrote:

From: slop badgerd <slopbadgerd@...>
Subject: [Statisticians_group] pearson correlation test with permutation
To: statisticians_group@...
Date: Saturday, May 2, 2009, 9:00 PM

hello everyone, im new to the group. just thought i'd say hello and ask for some help with a stats problem. im studying bees and am trying to find if there is a correlation between the distance between a pair of nests and the genetic relatedness between the bees occupying the nests. there is 21 nests in the sample and have i created a pairwise distance matrix showing the distance between each pair of nests, and similarly, a pairwise relatedness matrix for each pair of nests. it is my understanding that you cant use tables to get the statistical significance (p-value) of the pearson's correlation coefficient because the data are not independent (if you change the position of one nest, all the pairwise distance values from that nest to all the others will change) and therefore statistical significance must be calculated by permuting the data to generate a null distribution.
 
i have used the mantel test in the program zt to calculate the pearson correlation and its significance. however i now need to test for a correlation using only a subset of the data, such that it is not possible to create a full matrix (i.e. there will be gaps in the distance and relatedness matrices), and so the mantel test does not work properly.
 
the data is in 2 columns and i have done a normal pearson correlation test on it. however to calculate the p-value i assume i need to permute the data to create the null distribution. i have been looking on the internet for a program that can do this but i've had no luck. i downloaded a program called corrperm that looked like it could do the job but i couldn't figure out how to work it. i emailed the guy who wrote it and he replied with a new version of the program for R but i dont have a clue how to use that (i have no programming knowledge or experience with anything of that sort). the guy's away now for a week and i need to do this test asap so if anyone has any advice it would be really appreciated, e.g. is there any simple program out there that can run on windows command prompt with clear instructions how to use it?
 
thanks in advance,
 
david





#3429 From: slop badgerd <slopbadgerd@...>
Date:: Sun May 3, 2009 1:00 am
Subject:: pearson correlation test with permutation
senoryardstick
 
hello everyone, im new to the group. just thought i'd say hello and ask for some help with a stats problem. im studying bees and am trying to find if there is a correlation between the distance between a pair of nests and the genetic relatedness between the bees occupying the nests. there are 21 nests in the sample and i have created a pairwise distance matrix showing the distance between each pair of nests, and similarly, a pairwise relatedness matrix for each pair of nests. it is my understanding that you cant use tables to get the statistical significance (p-value) of the pearson's correlation coefficient because the data are not independent (if you change the position of one nest, all the pairwise distance values from that nest to all the others will change) and therefore statistical significance must be calculated by permuting the data to generate a null distribution.
 
i have used the mantel test in the program zt to calculate the pearson correlation and its significance. however i now need to test for a correlation using only a subset of the data, such that it is not possible to create a full matrix (i.e. there will be gaps in the distance and relatedness matrices), and so the mantel test does not work properly.
 
the data is in 2 columns and i have done a normal pearson correlation test on it. however to calculate the p-value i assume i need to permute the data to create the null distribution. i have been looking on the internet for a program that can do this but i've had no luck. i downloaded a program called corrperm that looked like it could do the job but i couldn't figure out how to work it. i emailed the guy who wrote it and he replied with a new version of the program for R but i dont have a clue how to use that (i have no programming knowledge or experience with anything of that sort). the guy's away now for a week and i need to do this test asap so if anyone has any advice it would be really appreciated, e.g. is there any simple program out there that can run on windows command prompt with clear instructions how to use it?
 
thanks in advance,
 
david
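
For readers who land on this question: a two-tailed permutation test for a Pearson correlation on two columns can be sketched in a few lines of Python. This is only a sketch under a stated assumption, namely that the rows can be shuffled freely (are exchangeable) under the null hypothesis. As the post itself notes, pairwise nest data share nests, so that assumption may not hold here; a Mantel-type test shuffles nest labels instead of rows. The function names are hypothetical and are not taken from corrperm or zt.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(xs, ys, n_perm=9999, seed=0):
    """Return (observed r, two-tailed permutation p-value).

    The second column is shuffled n_perm times; the p-value is the share of
    shuffles whose |r| is at least as large as the observed |r|. The observed
    arrangement is counted as one of the permutations, hence the +1 terms.
    """
    rng = random.Random(seed)
    observed = pearson(xs, ys)
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so that exact ties are not lost to rounding error.
        if abs(pearson(xs, shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up numbers purely to show the call; replace with the two data columns.
    distance = [1.0, 2.5, 3.1, 4.8, 5.0, 6.7, 7.2, 8.9]
    relatedness = [0.9, 0.7, 0.8, 0.5, 0.6, 0.3, 0.4, 0.1]
    r, p = permutation_pvalue(distance, relatedness)
    print("r =", round(r, 3), "two-tailed p =", p)
```

Saved as a file, this runs from the Windows command prompt with `python` followed by the file name, provided Python is installed.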




#3428 From: Andrew Hartley <khahstats@...>
Date:: Sat May 2, 2009 11:06 pm
Subject:: Re: The variance of OLS estimator
khahstats
 
A Riadi,
for some reason the link you provided is not opening for me. However, if you express the model in matrix form, as
y = x*beta + epsilon,
where the first column of x is all 1's & the first element of beta is the a you are asking about,
then the variance of the OLS estimator of the beta vector is
V = sigma^2 times the inverse of (x'x),
where sigma^2 is the common variance of the (uncorrelated) errors epsilon.
In particular, the (1,1) element of V is the variance of the OLS estimator of a. If this does not make sense to you then you could do well to take the time to learn matrix language.
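
As a small numerical check of the matrix formula, the sketch below builds x'x for the simple model y = a + b*x + e, inverts the 2-by-2 matrix by hand, and multiplies by the error variance. It assumes uncorrelated errors with a common, known variance; the function name and the example numbers are made up for illustration.

```python
def ols_covariance(xs, sigma2):
    """Covariance matrix sigma^2 * inverse(x'x) for the model y = a + b*x + e.

    The design matrix has a column of 1's and a column of the x values, so
    x'x = [[n, sum(x)], [sum(x), sum(x^2)]], which is inverted in closed form.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    det = n * sum_x2 - sum_x * sum_x
    inverse = [[sum_x2 / det, -sum_x / det],
               [-sum_x / det, n / det]]
    return [[sigma2 * v for v in row] for row in inverse]


if __name__ == "__main__":
    V = ols_covariance([1, 2, 3, 4, 5], sigma2=2.0)
    print("var of a-hat:", V[0][0])   # the (1,1) element
    print("var of b-hat:", V[1][1])   # the (2,2) element
```

The (1,1) element agrees with the scalar form sigma^2 * (1/n + xbar^2 / Sxx), where Sxx is the sum of squared deviations of x from its mean, and the (2,2) element agrees with sigma^2 / Sxx.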

--- On Sat, 5/2/09, Riadi <a_riadi_a@...> wrote:

From: Riadi <a_riadi_a@...>
Subject: Re: [Statisticians_group] The variance of OLS estimator
To: Statisticians_group@...
Date: Saturday, May 2, 2009, 5:55 PM

Thank you for your advice.
It presents simple model y=a+bx+e, after that it shows how to find the variance of OLS estimator b.
What I want to know is how to find the variance of OLS estimator a ?
 
Any help will be much appreciated. Thank you very much.






#3427 From: Riadi <a_riadi_a@...>
Date:: Sat May 2, 2009 9:55 pm
Subject:: Re: The variance of OLS estimator
a_riadi_a
 
Thank you for your advice.
It presents the simple model y=a+bx+e, and then shows how to find the variance of the OLS estimator b.
What I want to know is how to find the variance of the OLS estimator a.
 
Any help will be much appreciated. Thank you very much.

--- On Sun, 5/3/09, Andrew Hartley <khahstats@...> wrote:
From: Andrew Hartley <khahstats@...>
Subject: Re: [Statisticians_group] The variance of OLS estimator
To: Statisticians_group@...
Date: Sunday, May 3, 2009, 2:36 AM

A Riadi,
I advise you to consult a standard textbook on this topic. These variances are almost always presented in introductory statistics texts.





#3426 From: Deniz Senturk <dnz_senturk@...>
Date:: Sat May 2, 2009 7:01 pm
Subject:: CANALS
sea_1881
 
Hi,
Can anyone please suggest an online source for Categorical Canonical Analysis (CANALS)?


Thank you,
Deniz.



#3425 From: Andrew Hartley <khahstats@...>
Date:: Sat May 2, 2009 5:36 pm
Subject:: Re: The variance of OLS estimator
khahstats
 
A Riadi,
I advise you to consult a standard textbook on this topic. These variances are almost always presented in introductory statistics texts.

--- On Sat, 5/2/09, A. Riadi <a_riadi_a@...> wrote:

From: A. Riadi <a_riadi_a@...>
Subject: [Statisticians_group] The variance of OLS estimator
To: Statisticians_group@...
Date: Saturday, May 2, 2009, 8:23 AM

Dear all statisticians_group member.

Can any body help me to show how to derive the variance of Ordinary Least Squares (OLS) estimator?

y=a+bx+e
I have seen the variance of OLS estimator a and b.
var[a] and var[b], but I don't know how to derive those variance. Please help me. Thank you very much.



#3424 From: Statistician Statistician <martinholt42@...>
Date:: Sat May 2, 2009 4:48 pm
Subject:: Statistician: Statistician Statistician has sent you a friend message
martinholt42
 
Hey Statistician,

Just a reminder that Statistician S just sent you a new friend request.

Click below to view your message and reply:
http://www.flixster.com/invite/pending?em.id=467766851

We'll talk later,
Flixster



If you prefer not to be notified about new friend requests from Flixster members, click here.

#3423 From: sekaran L <sekarstats1984@...>
Date:: Sat May 2, 2009 1:47 pm
Subject:: RE:Need Information
sekarstats1984
 
Hi all,
 
I wish to know of SAS training institutes in Chennai, specifically for the clinical domain. It would be most helpful if any of you could suggest the best training institute in Chennai that offers training for clinical research with placement assurance. Looking forward to your favourable reply in this regard.
 
Thanks in Advance,
 
Regards,
Sekaran.L



#3422 From: "A. Riadi" <a_riadi_a@...>
Date:: Sat May 2, 2009 12:23 pm
Subject:: The variance of OLS estimator
a_riadi_a
 
Dear all statisticians_group members,

Can anybody help me by showing how to derive the variance of the Ordinary Least
Squares (OLS) estimator?

y=a+bx+e
I have seen the variances of the OLS estimators a and b,
var[a] and var[b], but I don't know how to derive those variances. Please help
me. Thank you very much.
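
For reference, the usual textbook derivation can be sketched as below. It assumes the x values are fixed and the errors e_i are uncorrelated with mean 0 and common variance \sigma^2; the symbols \bar{x}, \bar{y} and S_{xx} are notation introduced here, not taken from the question.

```latex
\begin{align*}
S_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
\hat{b} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{S_{xx}}, \qquad
\hat{a} = \bar{y} - \hat{b}\,\bar{x}. \\[4pt]
\operatorname{var}[\hat{b}]
  &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \,\sigma^2}{S_{xx}^2}
   = \frac{\sigma^2}{S_{xx}}. \\[4pt]
\operatorname{cov}[\bar{y}, \hat{b}]
  &= \frac{\sigma^2}{n\, S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) = 0. \\[4pt]
\operatorname{var}[\hat{a}]
  &= \operatorname{var}[\bar{y}] + \bar{x}^2 \operatorname{var}[\hat{b}]
     - 2\,\bar{x}\,\operatorname{cov}[\bar{y}, \hat{b}]
   = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)
   = \frac{\sigma^2 \sum_{i=1}^{n} x_i^2}{n\, S_{xx}}.
\end{align*}
```

The last equality uses S_{xx} + n\bar{x}^2 = \sum x_i^2. Each variance follows from writing the estimator as a linear combination of the y_i and summing the squared coefficients times \sigma^2.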

#3421 From: "Bhupinder Farmaha" <bhupi80singh@...>
Date:: Fri May 1, 2009 6:36 pm
Subject:: Urgent help needed
bhupi80singh
 

Hello friends

I am having a hard time finding the error in my SAS mixed model. Here is my problem: I am
looking at the effect of three variables on soybean yield.

I have an RCBD with a split-split-plot arrangement. I have two different
environments (random), and blocks (random) are nested within environment.
Main plots are tillage (3 levels), the first split is phosphorus (4 levels) and
the second split is potassium (4 levels). Tillage, phosphorus and potassium are
fixed effects.

I have written the following model but am unable to find the error. Please, guys,
help me to run this. I have to report my analysis by tomorrow in an
abstract.

TITLE 'P and K Placement Study';

Options ps=50 ls=74 pageno=1;

/* The data come from the external CSV file, so the empty CARDS block is dropped
   and the DATA step ends with a single RUN statement. */
DATA Soybean_yield;

Infile "F:\Sp_project.csv" delimiter="," firstobs=2;

Input Year location env Plot_no Block Tillplc$ Prate$ Krate$ Yield;

run;

/* CLASS lists each classification factor once. Yield is the continuous
   response, so it is removed from CLASS (and env is no longer listed twice). */
proc mixed data = soybean_yield method = type3;
class env Block Tillplc Prate Krate;
model yield = Tillplc|Prate|Krate/ddfm=kr;

random env block(env)
env*tillplc tillplc*block(env) env*Prate env*tillplc*Prate tillplc*Prate*block(env)
env*Krate env*tillplc*Krate env*Prate*Krate env*tillplc*Prate*Krate ;
run; quit;

Any help will be much appreciated.

Thanks

Bhupinder

 


#3420 From: Andrew Hartley <khahstats@...>
Date:: Fri May 1, 2009 6:24 pm
Subject:: Percents & Rural/Urban Health Care Professionals
khahstats
 

Akib,

Your topic is extremely important; I’m glad that people are working on it. This problem of a lack of rural doctors & nurses affects a large number of countries.

An example of something you might want to study is the proportion of health care professionals (HCPs) who migrate to cities because they enjoy city life. Call this proportion “mu.” It’s a “quantity of interest,” or a “parameter.”

Then, I suppose a meaningful result you want to present would be in the format “We can be 95% certain that mu is between 0.2 & 0.3.” This type of result identifies the “strength of the findings” & the degree to which we can “be confident about the conclusions” that you mention.

However, when you focus on “strength of the findings” or being “confident about the conclusions” of this or that percentage, you must keep in mind that standard statistical techniques don’t convey that type of meaning. A 95% confidence interval (CI) is one such technique. Most people (& even many statisticians) believe that a CI is a statement about strength of findings or about confidence in conclusions. It’s not. To understand 95% CIs, you must pretend that we are performing an experiment an infinite number of times, & for each experiment we calculate a 95% CI. The 95% implies that 95% of all these CIs will contain the parameter. The 95% is, then, a confidence about an infinite series of CIs, rather than a confidence that the CI we have obtained contains the parameter. Unfortunately, most people cannot resist the temptation to say that “We are 95% certain this CI contains the parameter.” (If you want me to illustrate how we might be more or less than 95% confident of that, I can do so.) Furthermore, since they think we can be 95% certain that a given CI contains the parameter, they continue to use CIs.
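
The "infinite series of CIs" reading can be illustrated with a short simulation. This is a sketch under illustrative assumptions: a true proportion mu = 0.25 (inside the 0.2 to 0.3 range used as an example above), 300 respondents per survey as in the study described, simple random sampling, and the ordinary normal-approximation (Wald) interval. The function names are made up.

```python
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * (p_hat * (1.0 - p_hat) / n) ** 0.5
    return p_hat - half_width, p_hat + half_width


def coverage(mu=0.25, n=300, n_experiments=2000, seed=1):
    """Fraction of repeated surveys whose interval contains the true proportion mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # One simulated survey: n independent yes/no answers with P(yes) = mu.
        successes = sum(rng.random() < mu for _ in range(n))
        low, high = wald_interval(successes, n)
        hits += low <= mu <= high
    return hits / n_experiments


if __name__ == "__main__":
    print("share of intervals containing mu:", coverage())
```

The share comes out close to 0.95 across the repeated surveys, but any single interval either contains mu or it does not; the 95% describes the procedure over many repetitions, not the one interval in hand.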

You may be looking for quick & ready answers. However, not knowing what you’ve heard concerning standard statistical techniques such as CIs, I’m taking the time to explain this to help you understand the disconnection between CIs & the “strength” & “confidence” you seek. Does what I’ve written so far make sense to you?


--- On Fri, 5/1/09, akib ul <akib_du@...> wrote:


From: akib ul <akib_du@...>
Subject: Re: [Statisticians_group] Percentage - Percentile?
To: Statisticians_group@...
Date: Friday, May 1, 2009, 3:21 AM

Dear Hartley

Thanks for the reply. The study was a type of opinion survey that tried to identify why health care professionals do not want to stay in rural areas and why they want to migrate to cities or to developed countries. The data were collected with a questionnaire whose questions covered economic, political, social, and professional aspects.

A convenience sampling procedure was used to collect the data, so the sample was of course neither random nor representative.

The researcher used percentage (based on how many respondents answered a question "yes" among the total 300 respondents) to show the importance of different contributing factors.

Finally, based on the percentages, the researcher drew some conclusions and put forward some suggestions to stop the above-mentioned migration.

My question is about the strength of findings that are based purely on percentages (even if it were a random sample). To what extent can we be confident about conclusions that identify the factors contributing to, or responsible for, something that is a psychological or sociological phenomenon? It might be that there are other, more important factors which were not included in the questionnaire or considered by the researcher.
In behavioral research, what is the scope for using percentages to come to a conclusion?
Sorry, this has become a long mail. However, the answer will help me to clear up my confusion. Also, a reference article/book (available on the internet) will be helpful.

Regards
Akib
--- On Thu, 4/30/09, Andrew Hartley <khahstats@yahoo.com> wrote:

From: Andrew Hartley <khahstats@yahoo.com>
Subject: [Statisticians_group] Percentage - Percentile?
To: Statisticians_group@yahoogroups.co.in
Date: Thursday, April 30, 2009, 10:08 AM



Akib,
I have difficulty approaching your question without more information. "Percentage" can mean many different things; are you speaking of "percentile"? Can you please explain more about what you want to do? What are you studying? What scientific conclusions do you want to draw? Are you looking at economic, biological, clinical, or educational data, or something else? Are you trying to infer something about the location of a population? How certain do you need to be about that location? Why can't you take a purposive sample?

--- On Thu, 4/30/09, akib ul <akib_du@yahoo.com> wrote:

From: akib ul <akib_du@yahoo.com>
Subject: [Statisticians_group] please help me
To: Statisticians_group@yahoogroups.co.in
Date: Thursday, April 30, 2009, 4:43 AM

Can anyone please tell me about the strength of "percentage" as a statistical tool for drawing conclusions without doing any significance test and while using a purposive sample?
Regards
Akib





