PSY202-221782401
From PsychWiki - A Collaborative Psychology Wiki
Concepts:
Dependent Variable
Definition: According to the textbook, a dependent variable is the variable that is measured.
Example: http://news.xinhuanet.com/english/2009-12/08/content_12609438.htm
Application: In the study, the researchers checked the DNA of 300 severely obese children for mutations in copy number variants (CNVs), which are large segments of DNA that are either copied or missing in our genes. Scientists suggest that this portion of the DNA plays a vital role in the development of genetic diseases. The variable being measured in this article is the presence of CNV mutations in the DNA of the sampled obese children.
Nominal Variable
Definition: According to the textbook, a nominal scale consists of mutually exclusive and exhaustive categories that differ in some qualitative aspect.
Example: http://www.newsweek.com/id/207842
Application: This article jokes about 7 ways to be "cool" at Comic Con. The 7 ways are not listed in any particular order, and no way is presented as better than another. The nominal variables in this article are the 7 ways to be cool at Comic Con.
Ordinal Variable
Definition: According to the textbook, an ordinal variable or scale has the properties of a nominal scale, but in addition the observations may be ranked in order of magnitude, with nothing implied about the difference between adjacent steps on the scale.
Example: http://www.newsweek.com/id/226015
Application: The article posted is about four reasons to be optimistic about the November job reports. The ordinal variable here is the order of the reasons the writer placed in the article.
Interval Variable
Definition: According to the textbook, an interval scale has all properties of an ordinal scale, and a given distance between measures has the same meaning anywhere on the scale.
Example: http://money.cnn.com/2009/12/07/markets/gold/index.htm?postversion=2009120715
Application: This article discusses the rise in the price of gold during the past couple of months. It shows a graph of the increase in the value of gold during the past year. The interval variable is the consistent distance between the price values for gold on the graph.
Ratio Variable
Definition: According to the textbook, a ratio scale has all the properties of an interval scale plus an absolute zero point.
Example: http://lakersblog.latimes.com/lakersblog/2009/12/lakers-beat-suns-10888.html
Application: This article reports the game score between the Lakers and the Suns. A ratio scale starts with an absolute zero, and in this case the points in basketball start from 0 and accumulate until the final score after the last quarter of the game. Every quarter in the NBA except for overtime is always 12 minutes.
Frequency Distribution (regular, grouped, relative, or cumulative)
Definition: According to the textbook, a frequency distribution shows the number of observations for the possible categories or score values in a set of data.
Example: http://brainstormtech.blogs.fortune.cnn.com/2009/12/04/ad-wars-droid-manly-iphone-girly/
Application: The article is about the cell phone competition between the iPhone and the Droid. A graph in the article shows a survey done for Oct., Nov., and Dec. on whether the person asked would recommend Blackberry phones, the iPhone from Apple, or the Droid from Motorola to his/her friends. The cell phone recommendation is the variable being observed.
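A frequency distribution like the one defined above can be tallied in a few lines. The following is only a minimal sketch: the responses are made-up values for illustration (they are not the survey data from the article), and the function name `frequency_distribution` is my own.

```python
from collections import Counter

# Hypothetical survey responses, invented for illustration only.
responses = ["iPhone", "Droid", "iPhone", "Blackberry", "Droid", "iPhone"]

def frequency_distribution(values):
    """Return {category: (frequency, relative frequency)} for a list of observations."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, count / total) for category, count in counts.items()}

distribution = frequency_distribution(responses)
for category, (freq, rel_freq) in distribution.items():
    print(f"{category}: f = {freq}, relative f = {rel_freq:.2f}")
```

The frequency column counts the observations in each category, and the relative frequency column divides each count by the total number of observations.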
Percentile (percentile or percentile rank)
Definition: According to the textbook, percentile or percentile rank is the percentage of cases in a distribution that falls below a given point on the measurement scale.
Example: http://www.cnn.com/2009/HEALTH/08/14/cocaine.traces.money/index.html
Application: This article is about how 90% of money in the US carries traces of cocaine. The article includes a small chart on the side that shows the percentile rank of other countries (such as China, Brazil, etc.) and how much of their money shows traces of cocaine.
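The definition of percentile rank above (the percentage of cases that fall below a given point) can be sketched directly for ungrouped scores. The scores below are hypothetical, and this simple count does not do the interpolation that a textbook would use for grouped data.

```python
def percentile_rank(scores, point):
    """Percentage of scores that fall below `point` (simple ungrouped version)."""
    below = sum(1 for score in scores if score < point)
    return 100 * below / len(scores)

# Hypothetical exam scores, invented for illustration only.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank = percentile_rank(scores, 90)
print(f"Percentile rank of a score of 90: {rank}")
```

Seven of the ten hypothetical scores fall below 90, so a score of 90 has a percentile rank of 70 in this made-up distribution.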
Histogram
Definition: According to the textbook, a histogram is a graph that consists of a series of rectangles, the heights of which represent frequency or relative frequency.
Example: http://www.statcan.gc.ca/edu/power-pouvoir/ch9/images/histo1.gif
Application: The histogram shown plots salary (in thousands) against the number of employees who earn each salary amount.
Frequency Polygon
Definition: According to the textbook, a frequency polygon is a graph that consists of a series of connected dots above the midpoint of each possible class interval (height of the dots corresponds to frequency or relative frequency).
Example: http://media.wiley.com/Lux/73/25673.nfg009.jpg
Application: The frequency polygon shown in the example is to display the items sold at a random garage sale. The Y-axis represents the frequency in the number of items sold at the garage sale, and the X-axis represents the price of the items.
Bar Diagram
Definition: According to the textbook, a bar diagram is a graph used for qualitative data that is similar to a histogram, except that space appears between the rectangles.
Application: This blog shows a bar diagram of the amount of research money (external grants & contracts) awarded to UPEI from 2000 to 2006.
Pie Chart
Definition: According to the textbook, a pie chart is used for qualitative data; the area of any piece of the pie shows the relative frequency of a category.
Example: http://www.warresisters.org/pages/piechart.htm
Application: This website has a pie chart that shows qualitative data on "where your income tax money really goes" for 2009. The biggest piece of the pie chart is money spent on "current military" at 36%, and the smallest piece is "physical resources" at 5%.
Mean
Definition: According to the textbook, the mean is the sum of all the scores divided by the total number of scores.
Example: http://www.nba.com/lakers/stats/
Application: The website given as an example shows the statistics of the Lakers' 2009-10 season. The first chart, labeled "player averages", lists each type of statistic and gives the team averages. For PPG (points per game), the mean number of points the team scores per game is 104.6.
Median
Definition: According to the textbook, the median is the value that divides the distribution into halves; another name for P50.
Example: http://www.census.gov/hhes/www/income/statemedfaminc.html
Application: This example refers to the first link on the page (American Community Survey, State Median Income by Family Size). The Excel file shows median family income for the past 12 months of 2008, for family sizes from 2-person families to 7-or-more-person families. In California, a 4-person family has a median income of about $79,477.
Mode
Definition: According to the textbook, the mode is the score that appears with the greatest frequency.
Example: http://www.imdb.com/chart/top
Application: This website shows the top 250 movies as voted by IMDb users. The number 1 ranked movie is The Shawshank Redemption, which received 458,317 votes. Since this movie got the most votes, the mode for this data would be The Shawshank Redemption.
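The three measures of central tendency defined above (mean, median, and mode) can be compared on one small data set. This is only a sketch using Python's standard `statistics` module; the scores are hypothetical and are not taken from any of the linked examples.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 5, 7, 10]

mean_score = statistics.mean(scores)      # sum of scores / number of scores = 30 / 6
median_score = statistics.median(scores)  # midpoint of the two middle scores, 3 and 5
mode_score = statistics.mode(scores)      # the score that appears most often

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

With these made-up scores the mean is 5, the median is 4, and the mode is 3, which shows that the three measures do not have to agree.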
Range
Definition: According to the textbook, the range is the difference between the lowest score and the highest score.
Example: http://www.nba.com/lakers/stats/
Application: The data is based on the Lakers' statistics for this season. The highest player average of points per game belongs to Kobe Bryant, at 28.9. The lowest belongs to Sasha Vujacic, at 2.0. Taking the difference between these two scores gives a range of 26.9.
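The range calculation above can be checked with a short sketch. Only the highest and lowest averages quoted in this entry are used; the other players' averages are not listed here, so they are left out, and the variable names are my own.

```python
# Only the two points-per-game averages quoted in this entry (highest and lowest).
# The rest of the roster is omitted because its values are not given here.
player_ppg = {"Kobe Bryant": 28.9, "Sasha Vujacic": 2.0}

highest = max(player_ppg.values())
lowest = min(player_ppg.values())
score_range = highest - lowest  # range = highest score - lowest score

print(f"range = {highest} - {lowest} = {score_range:.1f}")
```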
Variance
Definition: According to the textbook, the variance is the mean of the squares of the deviation scores.
Example: http://answers.google.com/answers/threadview/id/782033.html
Application: In this Google Answers question, someone asks about the expected return of two unknown stocks. The variance of stock A is 0.001446 and the variance of stock B is 0.014447. Therefore, stock B shows greater variance, meaning its returns are more spread out than those of stock A.
Standard Deviation
Definition: According to the textbook, the standard deviation is the square root of the variance.
Example: http://answers.google.com/answers/threadview/id/782033.html
Application: The calculated standard deviation for stock A is 0.0380, which is 3.80%. The standard deviation for stock B is 0.1202, which is 12.02%. Stock B's returns therefore show much greater variability than stock A's.
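The two definitions above (variance as the mean of the squared deviation scores, standard deviation as its square root) can be written out as a short sketch. The first data set is hypothetical; the two variances at the end are the ones quoted from the Google Answers thread. The function names are my own.

```python
import math

def variance(scores):
    """Mean of the squared deviation scores (divides by N, not N - 1)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print("variance:", variance(scores))
print("standard deviation:", standard_deviation(scores))

# Variances quoted in the thread for stock A and stock B.
sd_a = math.sqrt(0.001446)
sd_b = math.sqrt(0.014447)
print(f"stock A SD: {sd_a:.4f}, stock B SD: {sd_b:.4f}")
```

Taking the square root of each quoted variance gives the standard deviations reported for the two stocks, 0.0380 and 0.1202.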
Standard Scores (z-scores)
Definition: According to the textbook, standard scores all have a fixed mean and a fixed standard deviation; the z-score is one type of standard score.
Application: This journal article shows the calculation of z-scores in a study of the relationship between sports club participation, BMI (body mass index), and sport motor function in children. For boys, two main effects were significant: sports club participation and parents' level of education. Boys who were active in a sports club showed a lower BMI z-score (highest to lowest category: -0.22). These findings suggest a relationship between sports club participation, BMI z-score, and sport motor function.
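A z-score expresses each raw score as the number of standard deviations it lies above or below the mean, that is, z = (X - mean) / SD. The sketch below uses hypothetical scores (not the BMI data from the journal article) and a helper name, `z_scores`, of my own.

```python
import math

def z_scores(scores):
    """Convert raw scores to z-scores: (score - mean) / standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return [(x - mean) / sd for x in scores]

# Hypothetical raw scores, invented for illustration only (mean 5, SD 2).
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(raw)
print(z)
```

A negative z-score marks a raw score below the mean and a positive one marks a score above it, and the z-scores of a whole distribution always average to zero.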
Scatterplot
Definition: According to the textbook, the scatterplot is a graph of a bivariate distribution consisting of dots at the point of intersection of paired scores.
Example: http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/scatterplot-of-age-adjusted-death-ra
Application: These scatterplots show death rates by cause of death. When heart disease is selected as the X-axis and hypertension as the Y-axis, the bigger the "stroke" value, the higher the hypertension value. This particular scatterplot shows a positive relationship, which means that as one death rate increases, the other tends to increase as well.
Correlation (r)
Definition: According to the textbook, correlation is a measure of the degree of relationship between two variables.
Example: http://answers.google.com/answers/threadview/id/782033.html
Application: This calculation is based on 2 unknown stocks. The example shows the standard deviation of stock A, which is 0.0380, and the standard deviation of stock B, which is 0.1202. When you divide the covariance of 0.004539 (from the expected returns of stocks A and B) by the product of those two standard deviations, the resulting correlation between the stocks is 0.9937.
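The correlation between two variables equals their covariance divided by the product of their standard deviations. The sketch below checks the result using the three figures quoted from the Google Answers thread; the function name is my own.

```python
def correlation_from_covariance(covariance, sd_x, sd_y):
    """r = covariance(X, Y) / (SD of X * SD of Y)."""
    return covariance / (sd_x * sd_y)

# Figures quoted in the thread for stock A and stock B.
sd_a = 0.0380
sd_b = 0.1202
cov_ab = 0.004539

r = correlation_from_covariance(cov_ab, sd_a, sd_b)
print(f"r = {r:.4f}")
```

The value comes out very close to +1, which indicates a strong positive relationship between the returns of the two stocks.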
EXTRA CREDIT: Correlation does not equal causation
Definition: According to the textbook, if variation in X causes variation in Y, that causal connection will appear in some degree of correlation between X and Y. However, we cannot reason backward from a correlation to a causal relationship. The fact that X and Y vary together is a necessary but not a sufficient condition for us to conclude that there is a cause-and-effect connection between them.
Example: http://www.cnn.com/2009/HEALTH/10/29/kids.psychiatric.drugs.weight/index.html
Application: This study states that schizophrenia drugs cause weight gain in children. The study may lead readers to believe that all children taking some sort of drug for schizophrenia will gain weight. This is not necessarily true in all cases because, out of the 205 children who participated in the study, only about half gained more than 7% of their original body weight. One factor could be that some drugs have appetite-increasing side effects. Also, if only about half of the participants showed weight gain, then the level of confidence is not even near 90%. It could also be that the participants who gained weight are genetically prone to adding body weight faster and more easily than other people.
EXTRA CREDIT: GUIDELINE #7
Definition: "Check that results are fairly represented in graphics or concluding statements."
Example: http://www.warresisters.org/pages/piechart.htm
Application: The bar graph shown under the red bold words "ARE WE SAFE YET?" shows the cost of the Iraq and Afghanistan wars. The bars for 2008 and 2009 include a gray section. This misrepresents the data because the graph does not explain why those bars look different or what the gray section represents.
EXTRA CREDIT: GUIDELINE #3
Definition: "Examine the sampling method to decide whether it is likely to produce a representative sample."
Example: http://www.cnn.com/2009/HEALTH/10/29/kids.psychiatric.drugs.weight/index.html
Application: The study here examines whether schizophrenia drugs cause weight gain in children and teens. It is unknown how the sample of 205 participants was gathered. Whether these 205 participants are representative can be questioned because we do not know the sampling method. Since we do not know if the sample is representative, we cannot conclude that the data can be trusted.
EXTRA CREDIT: GUIDELINE #8
Definition: Stand back and consider the conclusions.
Example: http://www.cnn.com/2009/HEALTH/09/09/memory.boosters/index.html
Application: This article is based on "Ways to Boost Your Memory". Under the 30s, the article says, "Floss every day: What do loving licorice and hating the idea of flossing have in common? Both can contribute to plaque on your teeth, which is surprisingly bad for your brain. "The plaque between teeth can cause an immune reaction that attacks arteries, which then can't deliver vital nutrients to brain cells," says Dr. Michael Roizen, co-author of "YOU--The Owner's Manual: An Insider's Guide to the Body that Will Make You Healthier and Younger." Solution? Floss every day".
How exactly did the doctor who came up with the idea that plaque can cause an immune reaction find that plaque "can cause" this problem in our immune system, and that the reaction particularly targets the "arteries"? There is no proof of how these conclusions were drawn. Everyone knows that flossing is good for your teeth, but I just don't understand how having plaque can attack my arteries.
EXTRA CREDIT: GUIDELINE #2
Definition: Consider the Source.
Example: http://www.cnn.com/2009/HEALTH/10/16/cheating.near.death/index.html
Application: This article talks about how "near death experiences are in the mind". The article says, "According to the Near Death Experience Research Foundation, nearly 800 near-death experiences happen every day in the United States". Considering that the source is the "Near Death Experience Research Foundation", the conclusion they came up with may be biased, because how exactly are near-death experiences measured in the United States? That number is a rough estimate at best, because I doubt that anyone who has a near-death experience will contact the researchers to let them know they just had one.
EXTRA CREDIT: "LIE ON PURPOSE" Essay
The South Korean scientist Hwang Woo Suk and his colleagues reported that Dr. Hwang's group at Seoul National University had created 11 lines of cloned human embryonic stem cells. After the paper was carefully reviewed by editors at Science, they found no fabrication in the data given and decided to publish this breakthrough for the stem cell world. Shortly after, the research was found to be a fraud because the photos of the ES cell colonies in the paper were actually duplicates of the same stem cells. In addition, evidence shows that Hwang and his colleagues compensated donors of ova used for their research, and that junior members of Hwang's team were donating their own ova for research.
The fraud was detected when a couple of other Korean scientists decided to look into the paper, assuming that it had to contain fabrications. Also, another person who used to work with Hwang's team tipped off an investigative TV news program aired by the Seoul-based Munhwa Broadcasting Corp., asking to meet so that he could tell them the truth about the scandal. The only way this fraud could have been detected sooner is if Hwang and his colleagues had told the truth about their unethical research practices (using different donated ova for research) and had submitted their findings and procedures to the Institutional Review Boards for international stem cell scientists for review before submitting their paper to Science for publication. The researcher tried to suppress information about how he obtained the human eggs for his studies because he paid the donors and collected more than 1,600 oocytes, but reported in the paper that he used only 427 eggs for the cloning research. Also, he knew that if other scientists or the public learned that he collected eggs from some of his own researchers, he could be questioned or investigated for unethical research methods.
This scandal may cause the public to question the findings of stem cell researchers in the future, because the editors of Science did not look deeper into the findings of this paper and assumed that they could trust the source just by reading and looking at the data submitted. The publishers of Science, along with other top research journals, will become especially cautious when reading paper submissions. Although there is no sure way to prevent fraud, this may improve research guidelines because reviewers and editors will pay more attention to incomplete experimental and administrative answers provided by the authors of submitted papers. Especially when it comes to important claims of medical breakthroughs, scientists and editors should look at the surroundings and motivations of the author submitting the work. The motivation for fabricating this research could have been that many stem cell researchers around the world are all trying to find the answer that will bring us closer to understanding how we can use stem cells to combat diseases, to bring in money, and to become famous. From this, it is probable that Hwang and his colleagues wanted to be credited for unraveling the possible medical improvements from stem cells.
When a manuscript is submitted for publication, the editors of a journal are not supposed to assume that all submitted work from researchers can be trusted. They are also supposed to pay attention to the data in the paper and check that no questions are left unanswered. They should also determine whether the procedures and information in the paper are detailed enough to be reliable. To prevent the publication of fraudulent work, journals should examine papers using especially high standards for providing primary data, a careful examination of the data being presented, and knowledge of the contributions of every author and coauthor involved in the paper being submitted. If a journal discovers that the data within a paper it has published is erroneous, then the paper should immediately be investigated, retracted, and discredited.
