7/7/97
Methods of Empirical Research
designing educational research
research and statistical inference
some measurement
syllabus:
One objective exam with occasional assignments.
==============================
One term that pervades educ is theory.
theory =  an explanation, true, hypothesis, statement,
          measurable,
prediction and unifying explanation, systematic view, tying
together things, 
hypothesis: considered all that went before via research, does
all stuff that has gone before add up to  a hypothesis.
constructs: important term of soc sci; models, parameters, a
postulated underlying behavior that is observed: anxiety
intelligence: inferred from tests 
variable: independent (cause) and dependent (effect) variables, 

Theory hypothesis construct
----------------------------
observable behaviors

How do we get from top to bottom. 
operational definitions

It starts with the sci schema and it keeps redefining.  Test and
redefine; it is self-correcting, so new hypotheses are created.
++++++++++++++++++++++++++++++++++++++++++++
Problem: Should one engage in premarital intercourse?
not empirical
++++++++++++++++++++++++++++++++++++++++++++
What can be researched: outcomes as compared to other variables.
manipulated variables are a subset of indep var.
indep is always the way.
predict from indep to dep.

amount of reading to the intelligence of students.
which is indep: 
which is dep: 

both are
intelligent pop read
reading makes intelligent

It is in the theory, how the variables are used and defined.

The indep in one becomes dep in another.

_
X = mean
E = sigma or sum
X = a number
N = number of numbers

x = X - mean = deviation from mean

square them, sum, divide by N, take the square root = standard
deviation  s.d.

X    x    x2                    X       x    x2
5    2    4                     3       0    0
4    1    1                     3       0    0
3    0    0                     3       0    0
2    -1   1                     3       0    0
1    -2   4                     3       0    0
------------------------------------------------
15/5=3                       15/5=3

In the first group the s.d. = 1.4 in the second  = 0
The mean indicates the central tendency while
s.d = variability
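
A minimal Python sketch of the two groups above, assuming the s.d. divides by N (that is what gives 1.4 for the first group). The function names are made up for illustration.

```python
import math

def mean(scores):
    """Sum of the scores divided by N."""
    return sum(scores) / len(scores)

def standard_deviation(scores):
    """Root of the average squared deviation from the mean (divides by N)."""
    m = mean(scores)
    squared_deviations = [(score - m) ** 2 for score in scores]
    return math.sqrt(sum(squared_deviations) / len(scores))

group_1 = [5, 4, 3, 2, 1]
group_2 = [3, 3, 3, 3, 3]

print(mean(group_1), round(standard_deviation(group_1), 1))  # 3.0 1.4
print(mean(group_2), standard_deviation(group_2))            # 3.0 0.0
```

Same mean, different s.d.: the mean alone does not show how spread out the group is.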
     _
     X    s.d.
I    35   5
II   35   10        more diverse
III  40   10
IV   40   0         they cheated

average also may indicate median = midmost score in group.
mode = most freq occurring score.

See notes on paper 1/1  7/7/97

++++++++++++++++++++++++++++++++++++++++++++++++
7/10/97

bell curve
hor axis is scores 
vert axis freq
curve has kind of logic biggest pile in middle and fewer in
either direction.
IQ mean 100 most freq
Use it for a normal or average distribution group.


                         ^
                        /  \
                      /      \
                    - |      | -
                  /   |      |   \
                - |   |      |   | -
               2%  14% 34% 34% 14% 2% 
------------------------------------------------------
                -2   -1   X  1   2

as we go out 1 deviation  see paper notes
IQ
mean = 100
s.d. = 15

70 = bottom 2%
85 = bottom 16%
100 = mean
115 = top 16%
130 = top 2%

SAT
mean = 500
s.d. = 100
see 7/10 notes
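
A rough Python check of the percentages above, assuming scores follow an exact normal curve. The helper name percent_below is hypothetical; math.erf is the standard-library error function.

```python
import math

def percent_below(score, mean, sd):
    """Percent of a normal distribution that falls below `score`."""
    z = (score - mean) / sd  # distance from the mean in s.d. units
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

for iq in (70, 85, 100, 115, 130):
    print(iq, round(percent_below(iq, mean=100, sd=15)))

# An SAT of 600 sits 1 s.d. above its mean, the same spot as an IQ of 115.
print(round(percent_below(600, mean=500, sd=100)))
```

An IQ of 115 comes out with about 84% below it, which is the same as saying it is in the top 16%.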

other classifications of variables:

var as stimuli and response var. 
stimulus is something that causes a reaction 
ie envir that may affect behavior: home, time spent on instruc,
methods, teacher, 
response: how they react to it

intervening var is in between like being deaf, background,
maturity, language, hunger, sleep, 
active and assigned var:

NOT fire which is disaster.

active: one we manipulate; the experimenter assigns it, and these
are indep
assigned: we assign a value or a symbol, like IQ, M/F; can be
either indep or dep

math properties, or values is another kind of var
prob that can't be handled empirically, like ethics

What is empirical?
Problem: How have computers affected the amount of writing a
student produces, as compared to students who do not use the
computer?

Hypothesis: Students who use the computer are more apt to write
more than those who don't use the computer.

no incomplete constructions
problem: How does.. affect..
Keep specific measurement tool out
the prob = the theory so if the theory is not strong then the hyp
will be weak.
be specific with words, define them
use research words like relate to not correlate.
keep jargon out, KISS

Summary:
Prob : Write in terms of questions, but if sponsor wants
statement then defer to sponsor.  
Variables should be testable.
Tell problem early and include var.
your hyp is your conjecture of how the var are related.
hyp tells nature of relationship.  
Don't switch words.
don't use broad terms, find balance between specific and broad.
stay away from euphemism
avoid mythological groups
scores on X test 


SAMPLING:
universe or population: defined group
distinction all conceivable elements 
IQ's = all the IQ's of the US
part of group, no method of selection, define
any portion of that pop
random sample: any member of that group can be selected.
Random sampling.
7/14/97
Random sampling: don't take numbers out of a hat, there is a
procedure to do proper random sampling. Number every member of
the whole population.  Then use a table of random numbers.  No
pattern is discernible, how do you get these.  carry out square of 2 out....

see page 640-3 for random numbers:

using the chart, we closed our eyes and placed a finger on the
page.  Did this 3 times.  In each case we chose 10 single digits
in the col and found the average.  When we charted the averages
for the class we got a bell curve.  Why?

Central limit theorem:  Even when the pop. departs from normality,
the sampling distribution of the mean is approximately normally
distributed. What this means is that when we started with the
"random numbers" (the pool) we had a rectangle, an equal amount
of choices*, and in sampling averages we arrive at the bell.

*It is equal because we have equal numbers of all digits: x 9's,
x 8's, x 7's....  Even if we had a pop of IQ's in TC, where the
pool may lean to the high side so the shape of the pool is odd,
the result will be a bell.
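
A small simulation of the class exercise, as a sketch only: it swaps Python's random generator in for the printed table, and the seed and the count of 2000 averages are arbitrary choices, not from the lecture.

```python
import random
import statistics

def sample_means(num_samples, digits_per_sample=10, seed=1997):
    """Average `digits_per_sample` random digits, `num_samples` times over."""
    rng = random.Random(seed)  # fixed seed only so the run repeats
    means = []
    for _ in range(num_samples):
        digits = [rng.randint(0, 9) for _ in range(digits_per_sample)]
        means.append(sum(digits) / digits_per_sample)
    return means

means = sample_means(2000)

# Crude text histogram: one row per whole-number bin of the averages.
for low in range(10):
    count = sum(1 for m in means if low <= m < low + 1)
    print(f"{low}-{low + 1}: " + "#" * (count // 20))

print("mean of the averages:", round(statistics.mean(means), 2))
print("s.d. of the averages:", round(statistics.pstdev(means), 2))
```

The pool of single digits is flat (a rectangle), but the averages should pile up near 4.5 and thin out on both sides.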
                       _
Standard Error of Mean X

      s.d.         _                                      
    -------- =  SE X
      ____                                            
    \/n-1                        
                                          
  

example:
IQ's of stud at TC.
   _
   X = 120
s.d. = 15
   n = 10
    _         ____
 SE X = 15/ \/10-1  
    _         ___
 SE X = 15/ \/ 9
    _
 SE X = 15/3 = 5

                                 ^
                             _ / | \_
                          _/  |  |   |\_
                        _/    |  |   |   \_
                      _/      |  |   |      \_
                     /        |  |   |         \
                ------------------------------------
                           115  120 125

The intervals are called "confidence" intervals.  In this case
each step is 5 (the SE).  This tells me I can be 68% confident
that the pop mean is between 115 and 125, and 95% confident that
it is between 110 and 130.  With n = 10 the interval is wide, so
the precision is low.

In a small sample, too much error.
As the sample goes up, the error goes down.
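
The TC example above as a short Python sketch. It uses the formula exactly as written in these notes (s.d. over the root of n-1) and the rounded rule of 1 and 2 standard errors for 68% and 95%; the function names are hypothetical.

```python
import math

def standard_error(sd, n):
    """SE of the mean as written in these notes: s.d. / sqrt(n - 1)."""
    return sd / math.sqrt(n - 1)

def interval(mean, se, steps):
    """Range from `steps` standard errors below the mean to `steps` above."""
    return (mean - steps * se, mean + steps * se)

se = standard_error(sd=15, n=10)
print(se)                    # 5.0
print(interval(120, se, 1))  # (115.0, 125.0), about 68%
print(interval(120, se, 2))  # (110.0, 130.0), about 95%

# Same s.d., bigger samples: the standard error shrinks.
for n in (10, 26, 101):
    print(n, standard_error(15, n))
```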

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
7/16/97

simple random samples
take pop, number everyone in pop, and let the table pick.

diff men and women
     stratified set: subdivide pop into groups, category, number
          of objects is proportionate to pop.

In NYC:
homo and hetero grouping of retarded
educable?
variable = IQ

cluster sampling: going to a sch, pick n number of kids from
     within known groups.  This is economical
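
A sketch of simple random and stratified sampling in Python. It assumes the population is already numbered and uses Python's random module in place of the printed table; the function names and the men/women split of 40/60 are made up for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every member has the same chance of being picked; no repeats."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(strata, n, seed=None):
    """Pick from each subgroup in proportion to its share of the pop."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    picked = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)  # proportionate count
        picked[name] = rng.sample(members, share)
    return picked

population = list(range(1, 101))  # members numbered 1..100
print(simple_random_sample(population, 10, seed=7))

strata = {"men": list(range(1, 41)), "women": list(range(41, 101))}
sample = stratified_sample(strata, 10, seed=7)
print({name: len(members) for name, members in sample.items()})
```

With a 40/60 split and n = 10, the stratified sample takes 4 from one group and 6 from the other. The rounding can make the total drift from n when the shares are not whole numbers.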

These are the building blocks. Pollsters know this the best.

misconceptions ain't true.  
     Random is not the same as haphazard:
     Random is difficult to achieve:
     RS is not necessarily representative sampling:
     RS: random is not necessary in small samples:

Never find a true random sample

     I    II
a    1    2
b    2    3
c    3    4
d    4    5
e    5    6

patterns
I + 1 = II
basic notion behind correlation = relationship between 2 sets of
numbers.

Pearson product-moment coefficient
r =
index 1.00     to   .00
     perfect        absence of relation
strength is determined by the size of the number not pos/neg
.50  or -.70
-.70 is stronger (size .70 is bigger than .50)
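
A minimal Python sketch of the Pearson coefficient on the I and II columns above, using the deviation-score form of the formula. The function name is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Sum of cross-products of deviations over the root of the
    product of the two sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_sq_x = sum(dx * dx for dx in dev_x)
    sum_sq_y = sum(dy * dy for dy in dev_y)
    return cross / math.sqrt(sum_sq_x * sum_sq_y)

set_1 = [1, 2, 3, 4, 5]
set_2 = [2, 3, 4, 5, 6]

print(pearson_r(set_1, set_2))        # 1.0, the pattern I + 1 = II is perfect
print(pearson_r(set_1, set_2[::-1]))  # -1.0, just as strong, opposite direction
```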



correlation does not ensure causality, but causality does ensure
correlation.

Measurement:

reliability: consistency
validity: measuring what we were supposed to measure.
scores attributed to chance.
test retest reliability:

     I    II
a    5    6
b    4    5
c    3    4    
d    2    3
e    1    2

correlation 
r11  or rtt

it is logical.  you give it, you give it again.  one instrument,
items are helpless.
Prob:
test familiarity, memory, fatigue, gap between test.

alternate form, equivalent:  same  type of test

27
84

parallel forms / alternate reliability is valid in a speed test.
disadvantage : carryover effects, 
high correlation 


7/22/97
Reliability tells us to what extent scores on a test are nothing
but chance.  It is built on a statistic called variance.  Real
difference between .

Reliability is:
consistency
proportion of variance

To measure someone's IQ
true differences of 
correlation: if I get one and see a pattern I am measuring
something, maybe not what I want, but I am measuring something
real.
regression towards the mean
those in an extreme group have to move away from the extreme.

Raw Score(observed) = RS(true) + RS(error)

V = Variance

test whole group:

V(observed) = V(true) + V(error)

divide every term by V(O):

V(O)     V(true)     V(error)
----  =  -------  +  --------
V(O)      V(O)         V(O)

V(true)
-------  =  Reliability
V(O)
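
A simulated illustration of reliability as V(true) over V(observed). All the numbers here are assumptions picked so the answer should land near .80: true scores with s.d. 15 (V = 225) and chance error with s.d. 7.5 (V = 56.25), so 225 / 281.25 = .80. In real testing the true scores are never seen; this only works because the data are made up.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation from deviation scores."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(sum_sq_x * sum_sq_y)

rng = random.Random(722)  # arbitrary seed so the run repeats
true_scores = [rng.gauss(100, 15) for _ in range(5000)]

# Observed = true + error, with fresh chance error on each testing.
first_test = [t + rng.gauss(0, 7.5) for t in true_scores]
second_test = [t + rng.gauss(0, 7.5) for t in true_scores]

reliability = statistics.pvariance(true_scores) / statistics.pvariance(first_test)
test_retest = pearson_r(first_test, second_test)

print("V(true) / V(observed):", round(reliability, 2))
print("test-retest correlation:", round(test_retest, 2))
```

Both figures should come out close to .80, which is why a test-retest correlation can stand in for a ratio we cannot compute directly.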


variability in those scores is something real but not necessarily
     what we want.

How do we improve reliability?
# of ques, adding more can add reliability, more ques more
     reliability
clearing up ambiguity of language in test: instructions,
     questions, answers


Saying "a test is reliable" is dumb!!
consider groups, 
make the numbers talk for you even if you don't understand it.

estimating the error for a person giving one test
Standard Error of measurement.

measurement results and the things you intend to measure
is it testing what it is supposed to measure
and

see chap 27 for:
suppose I test and then I want to test validity of the test.  Ask
     other teachers; this is called logical validity: is this set
     of samples relevant to what is to be tested? 

Proficiency tests
criterion related validity. predictive & concurrent
selection aptitude tests, prediction type

when do content considerations come in to validity of GRE?
will the test do the job?

predictability


concurrent validity
same day as close together as possible.

Construct validity p420
construct could not be observed.  underlying idea: honesty,
     anxiety, test of intelligence which is a construct and
     tested reading is that a correlation? yes. 
big deal in personality type tests, are they valid for what they
     are being used for? 
we nibble away at, 
a test is valid only for a specific purpose, for one pop. 
raise reliability raise validity
cultural influences which could affect validity

face validity: what test seems to be measuring which is not
validity.

stat inference

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
7/23/97

if your measures are no good, your report is no good.  Just
because someone publishes doesn't mean it's good.  Have to look
yourselves.

statistical inference:  

L    x    x2   r    x    x2
4    0    0    3    0    0
5    1    1    1    -2   4
3    -1   1    5    2    4
2    -2   4    2    -1   1
6    2    4    4    1    1
----------------------------
20        10   15        10
----------------------------
5         n-1  5         n-1

20/5=4=mean  V=10/(5-1)=10/4=2.5
15/5=3=mean  V=10/(5-1)=10/4=2.5

variance = s.d squared
V=s.d.2

V = 2.5, so s.d. = square root of 2.5 = 1.58


If you describe the variability look at s.d.

we don't know how much is due to chance or indiv dif so we
     attribute all to chance.

diff is called all error

var of right side 

we have a measure of something other than treatments in each
group

we have measured left and right and then averaged the 2

means of l & r           x    x2
     3                   -.5  .25
+    4                   .5   .25
----------------------------------
     7/2 = 3.5 = mean         .5   *    5 =  2.5

The 5 above is the number of units in each group, so I'd use
however many were tested per group.  It is an adjusting factor.


analysis of variance:
computed 2 means
are the groups different, chance will give diff
go thru proced to see variance
est of variability on chance
if they are diff it will show up in means
is variability greater or is chance 
est of var as compared to est of chance so we are confident in
saying that the diff is not negligible.


more trad to report out mean and s.d.
but reporting out variance and mean is fine

in comparing 2 groups use variance.

don't take results at face value
In real life you need to know how to read the reports and hear
     the folks speak it.

computational formulas:
adding a constant to one group, equal addition to all.  fudging!!

     l    x2   r    x2
     6    36   3    9
     7    49   1    1
     5    25   5    25
     4    16   2    4
     8    64   4    16
------------------------------
Ex   30        15
Ex2       190       55
_
X    6         3

Ext = 30 +15 = 45        Sum of x total
Ex2 = 190 + 55 = 245     Sum of x squared

    (Ext)2     (45)2     2025
C = -------  =  ----  =  -----  = 202.5
      N          10       10

C = Sum of X total squared over N which is total # of pop

Total Sum of squares  SSt

SSt = Ex2t - C = 245 - 202.5 = 42.5

C is a correction term

          (Exl)2         (Exr)2
SSb  =    ------    +    -------   -  C
            nl             nr

Sum of squares between = Sum of total left squared over the
     number in left PLUS sum of total right squared over the
     number in right MINUS C

          30*30          15*15
SSb  =    ------    +    -----     -  202.5  =  22.5
            5              5       

degrees of freedom  = n-1

source    d.f.      ss        V         F
between   1         22.5      22.5      9
within    8         20        2.5
total     9         42.5

within is computed by subtracting between from total for d.f. and
ss.  To compute each V (variance), divide ss by d.f.; F is V
between divided by V within.

     F = Fisher

          Vb        22.5
F    =    ---  =    ---- =  9
          Vw        2.5
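
The computational formulas above as a Python sketch for two groups. The function name and the dictionary keys are hypothetical; the steps follow the C, SSt, SSb layout of these notes.

```python
def two_group_anova(left, right):
    """Sums of squares, variances, and F for two groups of scores."""
    all_scores = left + right
    n_total = len(all_scores)
    correction = sum(all_scores) ** 2 / n_total             # C
    ss_total = sum(x * x for x in all_scores) - correction  # SSt
    ss_between = (sum(left) ** 2 / len(left)
                  + sum(right) ** 2 / len(right)
                  - correction)                             # SSb
    ss_within = ss_total - ss_between                       # by subtraction
    df_between = 2 - 1                                      # groups - 1
    df_within = (n_total - 1) - df_between
    v_between = ss_between / df_between
    v_within = ss_within / df_within
    return {"C": correction, "SSt": ss_total, "SSb": ss_between,
            "SSw": ss_within, "Vb": v_between, "Vw": v_within,
            "F": v_between / v_within}

# Left group after the constant 2 was added to every score.
shifted = two_group_anova([6, 7, 5, 4, 8], [3, 1, 5, 2, 4])
print(shifted["SSt"], shifted["SSb"], shifted["F"])  # 42.5 22.5 9.0

# Left group before the constant was added.
original = two_group_anova([4, 5, 3, 2, 6], [3, 1, 5, 2, 4])
print(original["F"])                                 # 1.0
```

Adding the constant leaves the within variance at 2.5 but pushes the means apart, so only the between variance (and F) goes up.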

see pages 211-12

Choose two sets of 10 random numbers and compute this


If the probability of an event occurring by chance alone is only
5/100, and such an event does occur, we are going to assume that
it is not a chance occurrence.
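
One way to see the 5/100 idea on the two groups above, offered as a sketch. It is not the F table from the book: it swaps in a reshuffling check that tries every way of splitting the 10 scores into two groups of 5 and counts how often chance alone gives a gap between groups at least as big as the one observed. Variable names are made up.

```python
from itertools import combinations

left = [6, 7, 5, 4, 8]
right = [3, 1, 5, 2, 4]
scores = left + right
total = sum(scores)
observed_gap = abs(sum(left) - sum(right))  # gap between group sums

splits = 0
as_extreme = 0
for chosen in combinations(range(len(scores)), len(left)):
    left_sum = sum(scores[i] for i in chosen)
    gap = abs(left_sum - (total - left_sum))
    splits += 1
    if gap >= observed_gap:
        as_extreme += 1

p = as_extreme / splits
print(as_extreme, splits, round(p, 3))  # 10 252 0.04
```

Only 10 of the 252 possible splits are as extreme as the real one, which is under 5/100, so by the rule above we would not put the difference down to chance.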

TEST:  Objective
sci and sci approach hypotheses and problems
sampling sum of means constructs
measurement: reliability and validity
stat: ideas not computation