Psychological Assessment Reviewer

Title Psychological Assessment Reviewer
Author Ina Cabatino
Course BS Psychology
Institution Xavier University School of Medicine
Pages 33
File Size 1.1 MB
File Type PDF


PSYCHOLOGICAL ASSESSMENT – GENERAL CONCEPTS

Uses:
1. Measure differences between individuals or between reactions of the same individual under different circumstances
2. Detection of intellectual difficulties, severe emotional problems, and behavioral disorders
3. Classification of students according to type of instruction (slow and fast learners), educational and occupational counseling, selection of applicants for professional schools
4. Individual counseling – educational and vocational plans, emotional well-being, effective interpersonal relations, enhanced understanding and personal development, aid in decision-making
5. Basic research – nature and extent of individual differences, psychological traits, group differences, identification of biological and cultural factors
6. Investigating problems such as developmental changes across the lifespan, effectiveness of educational interventions, psychotherapy outcomes, community program impact assessment, and the influence of environment on performance

Tests range from measures of broad aptitudes to specific skills.

Features of a psychological test:
 Sample of behavior
o Objective and standardized measure of behavior
o Diagnostic or predictive value depends on how well the test serves as an indicator of relatively broad and significant areas of behavior
o Tests alone are not enough – it has to be empirically demonstrated that test performance is related to the skill set being tested
o Tests need not closely resemble the behavior they are trying to predict
o Prediction – assumes that the individual's performance on the test generalizes to other situations
o Capacity – can tests measure "potential"? Only in the sense that present behavior can be used as an indicator of future behavior; no psychological test can do more than measure behavior
 Standardization
o Uniformity of procedure when administering and scoring a test
o Testing conditions must be the same for all
o Establishing norms (the normal or average performance of others who took the same test under the same conditions)
o Raw scores are meaningless unless evaluated against suitable interpretative data
o Standardization sample – indicates average performance and the frequency of deviating by varying degrees from the average
 Indicates position with reference to all others who took the test
 In personality tests, indicates scores typically obtained by average persons
 Objective measurement of difficulty
o Objective – scores remain the same regardless of examiner characteristics
o Difficulty – items passed by the largest number of people are the easiest





 Reliability
o Consistency of scores obtained when retested with the same test or with an equivalent form of the test
 Validity
o Degree to which the test measures what it is supposed to measure
o Requires independent, external criteria against which the test is evaluated
o Validity coefficient (VC) – determines how closely the criterion performance can be predicted from the test score
 Low VC – low correspondence between test performance and criterion
 High VC – high correspondence between test performance and criterion
o Broader tests must be validated against accumulated data based on different investigations
o Validity is first established on a representative sample of test takers before the test is ready for use
 Tells us what the test is measuring
 Tells us the extent to which we know what the test measures

Guidelines in the Use of Psychological Tests
 General: prevent the misinterpretation and misuse of the test to avoid:
o rendering the test invalid; and
o hurting the individual
 A qualified examiner needs to:
o Select, administer, score, and interpret the test
o Evaluate validity, reliability, difficulty level, and norms
o Be familiar with standardized instructions and conditions
o Understand the test, the test-taker, and the testing conditions
o Remember that scores obtained can only be interpreted with reference to the specific procedure used to validate the test
o Obtain some background data in order to interpret the score
o Obtain information on other special factors that influenced the score
 The test user is anyone who uses test scores to arrive at decisions
o Most frequent cause of misuse: insufficient or faulty knowledge about the test
 Ensure the security of test content and communication
o Need to forestall deliberate efforts to fake scores
o Need to communicate in order to:
 Dispel the mystery surrounding the test and correct prevalent misconceptions
 Present relevant data about reliability, validity, and other psychometric properties
 Familiarize test-takers with procedures, dispel anxiety, and ensure that best performance is given
 Give feedback regarding test performance
 The test administration:
o Should help predict how the client will behave outside the testing situation
o Influences specific to the testing situation introduce error variance and reduce test validity
o Examiners need to memorize exact verbal instructions, prepare test materials, and familiarize themselves with the specific testing procedure
 The testing conditions
o Suitable testing room with adequate lighting, ventilation, seating facilities, and work space

o Implications of details during testing (e.g. improvised answer sheet, paper-and-pencil vs. computer, familiar examiner vs. stranger)
o Need to:
 Follow testing conditions to the most minute detail
 Record unusual testing conditions
 Take testing conditions into account when interpreting test results
o Some examiners may deviate from procedure to extract more information; however, scores obtained this way can no longer be compared to the norm
 Establish rapport
o Examiner's efforts to arouse interest in the test, elicit cooperation, and encourage test-takers to respond in a manner appropriate to the test objectives
o Any deviation from standard motivating conditions should be noted and used for interpretation
o Maximizing rapport:
 Maintain a friendly, cheerful, and relaxed manner
 Consider examinee characteristics (e.g. for children, consider presenting the test as a game and having brief test periods)
 Be sensitive to special difficulties
 Give reassurance – no one is expected to finish or to get all items correct (every test implies a threat to a person's prestige)
 Eliminate surprise by:
 Explaining the purpose and nature of the test
 Giving general suggestions on how to take the test
 Giving and answering sample items
 Convince them that it is in their own interest to obtain a valid and reliable score (e.g. avoiding waste of time, arriving at correct decisions)
 Examiner and Situational Variables
o E.g. age, sex, ethnicity, professional or socioeconomic status, training and experience, personality characteristics
o Manner: warm vs. cold, rigid vs. normal
o Testing variables: nature of the test, purpose of testing, instructions given to test-takers
o Examiner's non-verbal behavior (e.g. facial or postural cues)
o Test-taker variables: activities preceding the task, receiving feedback
o In case these situations cannot be controlled, qualify this in the feedback / report
 Training effects
o Coaching – close resemblance of test content and coaching material
 Leads to improvement of test scores
 BUT, since coaching is restricted to specific test content, there is low generalizability of improvement to other criteria
o Test sophistication – repeated testing experience introduces an advantage over first-time test-takers
 Need orientation and practice to equalize
o It is more effective to train on broad cognitive skills such as:
 Effective problem-solving
 Analysis of problems / questions
 Consideration of alternatives, details, and implications of a solution
 Deliberate formulation of a solution
 Application of high standards to evaluate performance

NORMS AND THE MEANING OF TEST SCORES

– The raw score is converted into a derived score, which:
 Measures the individual's relative standing in the normative sample
 Permits comparison of the individual's performance on different tests
 Can be expressed in terms of:
o the developmental level attained; or
o relative performance in reference to others in a specified group

STATISTICAL CONCEPTS
 Statistics – used to summarize data and facilitate an understanding of it
 Frequency distribution – obtained by grouping scores into class intervals and counting how often a score falling in each class interval appears within the data
 Normal curve features:
o Largest number of cases cluster in the center of the range
o Number drops gradually in both directions as the extremes are approached
o Symmetrical – 50% of cases fall to the left and 50% to the right of the mean
o Single peak in the center



 Central tendency – a single score that characterizes the performance of an entire group
o Mean – add all scores and divide by the total number of cases
o Mode – midpoint of the class interval with the highest frequency; the highest point on the distribution curve
o Median – the middle score when all scores have been arranged from smallest to largest
 Variability – extent of individual differences around the central tendency
o Range – difference between the highest and lowest scores



 Deviation – difference between an individual's score and the mean of the group (x = X − M)
 Variance – mean of the squared deviations
 Standard deviation – square root of the variance; symbolized s (or SD)
o A higher standard deviation means more individual differences (variation)
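The measures above can be checked with a short script. A minimal sketch in Python (the score list is invented for illustration):

```python
from statistics import mean, median, mode, pstdev

# Invented raw scores for ten test-takers
scores = [4, 5, 5, 6, 7, 7, 7, 8, 9, 12]

m = mean(scores)                      # sum of all scores / number of cases
md = median(scores)                   # middle score after sorting
mo = mode(scores)                     # most frequent score
deviations = [x - m for x in scores]  # x = X - M for each person
variance = sum(d * d for d in deviations) / len(scores)
sd = pstdev(scores)                   # population SD = sqrt(variance)

print(m, md, mo, round(sd, 2))
```

Note that `pstdev` divides by N (the population formula, appropriate when the group itself is the reference), whereas `stdev` would divide by N − 1.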

DEVELOPMENTAL NORMS
 Basal age – highest age at and below which all tests were passed
 Mental age – basal age + partial credits in months for tests passed above the basal age level
o The mental age unit shrinks correspondingly with age
 Grade equivalent – mean raw score obtained by children in each grade
o Disadvantages:
 Appropriate only for common subjects taught across grade levels (e.g. not applicable at the high school level)
 Emphasis on different subjects may vary from grade to grade
 Grade norms are not performance standards
 Ordinal scales – sequential patterning of early behavioral development
o Developmental stages follow a constant order; each stage presupposes mastery of an earlier stage

WITHIN-GROUP NORMS
 Percentile – percentage of persons who fall below a given raw score
o Indicates the person's relative position in the standardization sample
o The lower the percentile, the lower the standing
o Advantages:
 Easy to compute
 Easily understood
 Universally applicable
o Disadvantage: inequality of units

 Standard score – individual's distance from the mean in terms of standard deviation units
o Linearly derived standard scores – retain the exact numerical relations of the original raw scores
 Obtained by subtracting a constant from the raw score, then dividing by another constant
 Also called z scores:
   z = (X − µ) / σ
   where X = raw score, µ = mean, and σ = standard deviation
o Normalized standard scores – fit scores to any specified distribution curve (usually the normal curve), i.e. the distribution is transformed to fit the normal curve
 Compute the percentage of persons falling at or above each raw score
 Locate that percentage in the normal curve
 Obtain the normalized standard score
 Example: a score of −1 means the person surpassed approximately 16% of the group (the distance from −3σ to −2σ is 2.14% and the distance from −2σ to −1σ is 13.59%; 2.14 + 13.59 = 15.73 ≈ 16%)
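A small Python check of the z-score transformation and the roughly-16% figure in the example (the raw score, mean, and SD are invented):

```python
from statistics import NormalDist

def z_score(raw, mu, sigma):
    """Linear standard score: z = (X - mu) / sigma."""
    return (raw - mu) / sigma

# Invented example: raw score 85 on a test with mean 100 and SD 15
z = z_score(85, 100, 15)  # one SD below the mean

# Proportion of a normal distribution falling below z = -1
pct_below = NormalDist().cdf(z) * 100
print(z, round(pct_below, 1))  # about 16%, as in the example above
```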

 T score – (normalized standard score × 10) + 50; µ = 50, σ = 10
 Stanine – also called "standard nine"; µ = 5, σ = 2

Stanine   Percentage of cases
1         first 4%
2         next 7%
3         next 12%
4         next 17%
5         next 20%
6         next 17%
7         next 12%
8         next 7%
9         last 4%
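The stanine bands above accumulate into percentile cutoffs (4, 11, 23, 40, 60, 77, 89, 96). A sketch of mapping percentile ranks to stanines under that reading:

```python
import bisect

# Cumulative upper percentile bounds for stanines 1-8
# (4%, then +7, +12, +17, +20, +17, +12, +7; stanine 9 covers the rest)
CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(percentile):
    """Map a percentile rank (0-100) to a stanine (1-9)."""
    return bisect.bisect_left(CUTOFFS, percentile) + 1

print([stanine(p) for p in (2, 50, 98)])
```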

 Deviation IQ
o Originally, IQ = ratio of mental age to chronological age, multiplied by 100:
   IQ = (mental age / chronological age) × 100
 If IQ = 100, mental age = chronological age
o Deviation IQ – a standard score with µ = 100 and σ = 15 (or 16, depending on the test)
o Deviation IQs are only comparable across tests if the tests have the same mean and standard deviation
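Both IQ definitions above, in Python (the ages and scores are invented):

```python
def ratio_iq(mental_age, chronological_age):
    """Classic ratio IQ: (MA / CA) * 100; ages in months."""
    return 100 * mental_age / chronological_age

def deviation_iq(raw, mean, sd, iq_sd=15):
    """Deviation IQ: a standard score rescaled to mu = 100, sigma = 15 (or 16)."""
    z = (raw - mean) / sd
    return 100 + iq_sd * z

print(ratio_iq(120, 100))        # MA of 120 months, CA of 100 months
print(deviation_iq(60, 50, 10))  # raw score one SD above the group mean
```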

Relativity of Norms
 An IQ should always be accompanied by the name of the test
 An individual's standing may be misrepresented if inappropriate norms are used
 Sources of variation across tests:
o Test content
o Scale units (mean and standard deviation)
o Composition of the standardization sample
 Normative sample – ideally, a representative cross-section of the population for which the test is designed
o Sample – group of persons actually tested
o Population – larger but similarly constituted group from which the sample is drawn
o Should be large enough to provide stable values
o Should be representative of the population under consideration
 Else, restrict the population to fit the sample (redefine the population)
o Should consider specific influences affecting the normative sample
 Anchor norms – used to work out equivalency between tests
o Equipercentile method – scores are equivalent if they have equal percentiles on two tests (e.g. the 80th percentile in Test A = IQ of 115, and the 80th percentile in Test B = IQ of 120; therefore Test A's 115 is equivalent to Test B's 120)
 Specific norms – tests are standardized on more specific populations to suit the purpose of the test (a.k.a. subgroup or local norms)
 Fixed reference group – referred to for comparability and continuity
o An independently sampled group against which future test scores are compared
o Updated via an anchor test (a set of common items) that also occurred in the original reference group; adjustments are made by comparing the frequency of correct answers on the common items between the previous group and the present group

DOMAIN-REFERENCED TEST INTERPRETATION
 a.k.a. "criterion-referenced" testing
 Reference is a content domain rather than a group of persons
 Tests mastery of specific content (what can the client do?)
 Content meaning – focus on what test-takers can do vs. how they compare with others
 Should have content that is widely recognized as important
 Should have items that sample each objective
 Best used for testing basic skills at elementary levels
 Mastery testing – whether the individual has or has not attained a pre-established level of mastery
o Individual differences are of little or no importance
o Impractical for content beyond elementary skills because of differing levels of achievement and instruction
 Tests need to cover the critical variables required for performance of certain functions
 Efforts should be made to address the limitations of a single test score:
o The cutoff should be a band of scores rather than a single score on one administration of the test
o Decisions should depend on other sources of information
o Both test-construction and content experts should decide on cutoff scores
o Scores should be established on empirical data
 Expectancy table – probability of different criterion outcomes given a score

Score           Count   Lower than D   C     B     A
Lower than 10    14         43%       36%   14%    7%
10-19            71         47%       37%   24%    3%
20-29           104          9%       21%   43%   27%
30 or higher     22          5%        0%   36%   59%

OR

Score   Percentage eliminated during training
9          4%
8         10%
7         14%
6         22%
5         30%
4         40%
3         53%
2         57%
1         77%

 Gives a general idea of the validity of the test in predicting a criterion
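An expectancy table is just row-wise percentages over follow-up counts. A minimal sketch with invented counts (the band labels mirror the first table above):

```python
# Invented follow-up counts: for each score band, how many test-takers
# later earned each grade.
bands = {
    "below 10": {"below D": 6, "C": 5, "B": 2, "A": 1},
    "10-19":    {"below D": 8, "C": 10, "B": 6, "A": 1},
    "20-29":    {"below D": 2, "C": 5, "B": 10, "A": 8},
    "30+":      {"below D": 0, "C": 1, "B": 4, "A": 10},
}

def expectancy(row):
    """Convert outcome counts to percentage probabilities for one band."""
    total = sum(row.values())
    return {grade: round(100 * n / total) for grade, n in row.items()}

table = {band: expectancy(row) for band, row in bands.items()}
print(table["below 10"])
print(table["30+"])
```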

RELIABILITY
 Consistency of scores obtained by the same person across time, items, or other test conditions
 Determines whether score differences represent "true" differences or chance errors
 Estimates what proportion of test score variance is error variance
o Error variance – differences in scores resulting from conditions that are irrelevant to the purpose of the test
 No test is a perfectly reliable instrument

CORRELATION COEFFICIENT
 Expresses the degree of relationship between two sets of scores
 A zero correlation indicates the total absence of a relationship
 Pearson product-moment correlation coefficient – accounts for the individual's position in the group and the amount of deviation from the mean:
   r_xy = Σxy / (N · SD_x · SD_y)
   where Σxy = sum of the products of each person's deviation scores on the two tests, N = total number of cases, SD_x = SD of Test X, and SD_y = SD of Test Y
 Statistical significance – whether findings in the sample can be generalized to the population
o "Significant at the .01 level" = there is only about a 1-in-100 chance of finding such a correlation in the sample if the correlation were actually 0
o Significance level – the risk of error we are willing to take in drawing conclusions from our data
o Confidence interval – range of scores within which the true score might fall, given a specified level of confidence
 Reliability coefficient – application of the correlation coefficient to psychometric properties
o Desirable reliability coefficients usually fall in the .80s or .90s
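The correlation formula above, implemented directly (the score lists are invented):

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """r = sum(x*y) / (N * SDx * SDy), where x and y are deviation scores."""
    mx, my = mean(xs), mean(ys)
    sum_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sum_xy / (len(xs) * pstdev(xs) * pstdev(ys))

test_x = [1, 2, 3, 4, 5]    # invented scores on Test X
test_y = [2, 4, 6, 8, 10]   # perfectly related scores on Test Y
print(pearson_r(test_x, test_y))        # perfect positive correlation
print(pearson_r(test_x, test_y[::-1]))  # perfect negative correlation
```

Population SDs (`pstdev`) are used so the denominator matches the formula's N rather than N − 1.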

TYPES OF RELIABILITY

Test-Retest Reliability
 Repeat the same test on the same person on another occasion
 Error variance: time sampling (random fluctuations of performance between sessions)
 Shows how far test scores can be generalized across occasions
 Higher reliability = lower susceptibility to random changes
 Need to specify the length of the retest interval
 The interval rarely exceeds 6 months
 Most appropriate for tests little affected by repetition (e.g. sensorimotor, motor tests)

Alternate-Form Reliability
 The same persons take two equivalent forms of the test
 Test for the correlation of scores on the two forms
 Measures both temporal stability and consistency of response to two different item samples (to what extent does performance depend on the specific items or arrangement of the test?)
 Parallel forms must:
o be independently constructed;
o have items expressed in the same form;
o have the same type of content;
o have an equivalent range and level of difficulty; and
o have equivalent instructions, time limits, and sample items
 Disadvantage – questionable where repetition changes the nature of the task (e.g. insight tasks)

Split-Half Reliability
 Two scores are obtained for each person by dividing the test into equivalent halves (e.g. odd vs. even items)
 Test for the correlation between the two half-test scores
 Single administration of a single form
 Spearman-Brown formula – used to estimate the full-test reliability from the half-test correlation
o Used because this type of reliability technically computes only the reliability of half the test
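A sketch of the split-half procedure, assuming an odd-even split and the standard Spearman-Brown step-up, r_full = 2·r_half / (1 + r_half) (the 0/1 item matrix is invented):

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson r computed from deviation scores and population SDs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / (len(xs) * pstdev(xs) * pstdev(ys))

def split_half_reliability(item_scores):
    """Correlate odd-item vs. even-item half scores, then apply the
    Spearman-Brown correction: r_full = 2 * r_half / (1 + r_half)."""
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)

# Invented data: rows = persons, columns = pass/fail (1/0) on six items
data = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 1],
]
print(round(split_half_reliability(data), 2))
```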

Kuder-Richardson Reliability (a.k.a. inter-item consistency)
 Single administration of a single form
 Sources of error variance:
o content sampling
o heterogeneity of the behavior domain sampled
 The more homogeneous the domain, the higher the inter-item consistency
 However, is a homogeneous test appropriate for a heterogeneous psychological construct? Is the criterion being predicted homogeneous or heterogeneous?
 Unless the items are highly homogeneous, the KR coefficient will be lower than the split-half coefficient

Scorer Reliability
 In most tests, scorer variance is a factor excluded from error variance (it remains in the scores)
 Estimated by correlating the results obtained by two independent scorers

Interpreting reliability coefficients:
 Analysis of error variance (example):
o Error from delayed alternate forms = 1 − .70 = .30 (content + time)
o Error from split-half = 1 − .80 = .20 (content)
o Error variance due to time: .30 − .20 = .10
o Error from scorer reliability = 1 − .92 = .08 (interscorer)
o Total error variance = .10 + .20 + .08 = .38
o True variance = 1 − .38 = .62
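The error-variance bookkeeping in the worked example above, as plain arithmetic:

```python
# Reliability coefficients from the example above
r_delayed_alternate = 0.70  # error includes content + time sampling
r_split_half = 0.80         # error includes content sampling only
r_scorer = 0.92             # error includes interscorer difference only

content_plus_time = 1 - r_delayed_alternate  # 0.30
content = 1 - r_split_half                   # 0.20
time = content_plus_time - content           # 0.10
interscorer = 1 - r_scorer                   # 0.08

error_variance = time + content + interscorer
true_variance = 1 - error_variance
print(round(error_variance, 2), round(true_variance, 2))  # 0.38 0.62
```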

 Speed test – no one can complete all the items within the time limit

Standard Error of Measurement (SEM)
 Also expresses test reliability, but in terms of the original score units
 Gives the interval within which the true score may lie (obtained score ± 1 SEM)
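The text does not give the formula, but the standard SEM is SD·√(1 − r); a sketch using that assumption (the SD, reliability, and obtained score are invented):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

s = sem(15, 0.89)                    # invented: SD = 15, reliability = .89
obtained = 110
band = (obtained - s, obtained + s)  # obtained score +/- 1 SEM
print(round(s, 1), tuple(round(b, 1) for b in band))
```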

VALIDITY
Validity
 Degree to which the test measures what it is supposed to measure
 Main validation procedures: content-description, criterion-related, and construct validation

Content-Description Procedures (Content Validity)
 Systematic examination of the test content to evaluate whether it covers a representative sample of the behavior to be tested
 Content must be broadly defined to cover major objectives
 Important to prepare test specifications – content areas or topics to be covered, objectives, relative importance of topics, and number of items per topic
o More appropriate for achievement tests
o Does the test cover a representative sample of the specified skills and knowledge?
o Is test performance free from the influence of irrelevant variables?
 Face validity – whether the test "looks valid" to test-takers and other technically untrained observers
o A desirable feature of a test, but it should not be a substitute for other types of validity

Criterion-Related Validity
 Indicates the test's effectiveness in predicting performance in specified activities
 The concurrent/predictive distinction is not about time, but about the objective of testing
 Concurrent validation – used to diagnose existing status (Does the person qualify for the job?)
o Criterion data are already available
 Predictive validation – used to predict future outcomes (Does the person have the prerequisites to do well in a job?)
 Avoid criterion contamination (e.g. the rater's knowledge of test scores contaminating the criterion ratings)
 Criterion measure examples: academic achievement, performance in training, actual job performance, contrasted groups (extremes of the distribution of criterion measures), psychiatric diagnoses, ratings by authorities, and correlation between the new test and a previously available test

Construct Validity
 Extent to which a test measures a theoretical construct or trait
 Evidence includes research on the nature of the trait and the conditions affecting its development and manifestation
 Age differentiation – used in traditional intelligence tests
 Correlation with other tests – the new test measures approximately the same behavior as the previous test
o Correlations should be moderately high, but not so high that the new test is a needless duplication
 Factorial validity – correlation of the test with a factor common to a group of tests, identified through factor analysis
 Internal consistency – measure of homogeneity

