Help me with Standard Deviation

Discussion in 'General' started by pscook, Oct 11, 2020.

  1. pscook

    pscook Well-Known Member

    I'm taking a Business Analytics "Nanodegree," and I'm stuck on the Standard deviation. I understand how to calculate it, but now that I have it, what do I do with it? I'm working with $B and $M, so my standard deviation works out to 802,407,338 from a total population of 20 values (no samples).
    Population Mean: 1,190,920,000

    I'm like the dog that caught the car: What am I supposed to do with this now that I have it?
     
  2. Phl218

    Phl218 .

  3. Dave K

    Dave K DaveK über alles!

    I sort glass, that's my job.
     
  4. CB186

    CB186 go f@ck yourself

    Under 10 SD shows your reloading practices are quite good. Under 20 isn't too bad. 50fps is 1moa@1000 yards.
     
    black knight and Gorilla George like this.
  5. nigel smith

    nigel smith Well-Known Member

    It's been a rather long time since I made a B in college statistics, but I think we need more information to answer your question. In business, knowing what the standard deviation was for a given population would probably allow you to make sound production or marketing decisions. What exactly did you compute the deviation of?
     
    SGVRider likes this.
  6. SGVRider

    SGVRider Well-Known Member

    You have to learn a lot of other concepts to understand its utility and how to use it. Knowing how to calculate a standard deviation on its own is pretty useless. Read up on the normal distribution and the empirical rule, and start going down the rabbit hole a bit. You would learn context from your professor if you were taking this as an in-person lecture. Is this an online course? If “WTF can I do with this?” wasn’t the first thing they covered then I don’t think that’s a very good course. You need to understand theory and concepts before you run off calculating things.

    If we don’t know what you’re measuring, we can’t know what utility a standard deviation would be to you. In your above example, your standard deviation seems pretty large compared to your mean. Using the standard deviation you calculated wouldn’t be useful in many situations. That distribution may not even be normally distributed, in which case you’d need to use a non-parametric measure like median absolute deviation instead.
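
    A minimal sketch of the median absolute deviation mentioned above, next to the ordinary standard deviation. The data are made up for illustration (five similar values and one outlier), and the helper name `median_absolute_deviation` is hypothetical, not part of any library.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute distances from the median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])

# Hypothetical data: five similar values and one extreme outlier.
data = [10, 12, 11, 13, 12, 95]

mad = median_absolute_deviation(data)
sd = statistics.pstdev(data)  # population standard deviation

print(f"median = {statistics.median(data)}, MAD = {mad}")
print(f"mean = {statistics.mean(data):.1f}, SD = {sd:.1f}")
```

    The single outlier drags the mean and inflates the SD, while the median and MAD barely move. That is the sense in which the MAD is the safer summary for data that are far from normal.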

    To give a simple example of what use it is, you could apply it to the manufacture of motorcycle crankshafts. If the diameter of shafts produced from a certain process is normally distributed, understanding the standard deviation of diameters would be useful for many things. You could create a sampling process to ensure the manufacturing was within tolerance without measuring every crankshaft. You could understand how many crankshafts were likely to be defective given a production run of N units. You could then measure waste against the cost of improving your manufacturing process, etc. You could look for heteroskedasticity and identify degradation in your manufacturing process over time. Lots of things are possible, if you have context.
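
    The crankshaft example can be sketched in a few lines, assuming the diameters really are normally distributed. The target diameter, standard deviation, tolerance, and run size below are all invented for illustration.

```python
from statistics import NormalDist

# Hypothetical process: target 50.000 mm, SD 0.010 mm, tolerance +/- 0.025 mm.
diameters = NormalDist(mu=50.000, sigma=0.010)
lower, upper = 49.975, 50.025

# Probability that a single shaft falls outside the tolerance band.
p_defect = diameters.cdf(lower) + (1 - diameters.cdf(upper))

run_size = 10_000  # N units in the production run (assumed)
expected_defects = p_defect * run_size

print(f"P(out of tolerance) = {p_defect:.4f}")
print(f"Expected defects in {run_size} units = {expected_defects:.0f}")
```

    With the tolerance sitting 2.5 standard deviations from the target, a bit over 1% of shafts are expected to be out of spec. Halving the SD would push the tolerance out to 5 standard deviations and make defects very rare, which is the kind of trade-off you can then weigh against the cost of improving the process.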

    I don’t know what your goals are, but I would definitely recommend investing the time in learning this stuff well. Sampling and regression are EXTREMELY powerful tools in business and many other endeavors.

    I’d like to go on a rant about math education in America, but I’ll spare you.
     
    Last edited: Oct 11, 2020
    tophyr and RichB like this.
  7. pscook

    pscook Well-Known Member

    I'm running the numbers on four years of NYSE data, specifically R&D expenditures of five different Industrial conglomerates. My "question" is to find what correlation exists between R&D spending and Operating Profit. One of the tasks is to calculate and use Standard Deviation and measures of spread in the presentation.

    So I calculated the Standard Dev, and... I don't know what to do next. Everything that I have looked at online is basically "here's how to do it" not "here's what you do with it." At least what I have found.

    I'm going back through the course videos (yes online only) to see if I can find the gaping hole in my knowledge. I'm assuming it's me, because everything else has been very thorough. I just need this little nugget to give me the "aha!" moment so I can wrap up this project.
     
  8. dsapsis

    dsapsis El Jefe de los Monos

    In this case (as I understand the context) the SD value is simply a metric of variability in R&D spending (which in this example is huge relative to mean R&D per firm; take from that that R&D investment is all over the map, and you should be able to tell this just by looking at the raw numbers). It doesn't directly relate to any correlation analysis you need to do between R&D values and profits (which presumably you also have). Indirectly, the issue of residuals and constant variance assumptions would apply in a basic linear fit model.
     
    Phl218 likes this.
  9. SGVRider

    SGVRider Well-Known Member

    You need to normalize the values first. Raw values are meaningless here because you're comparing different entities. In financial analysis, we usually use a common size statement for normalization. If the question specified another method, use that. Calculating a common size statement is simple, on every income statement you're going to convert each line as its value relative to gross revenue. You need to do this in order to compare companies of disparate sizes.

    I assume by operating income they mean net income, so we'll call that ratio net income margin (NIM). Also calculate R&D spending margin and R&D/NI for each firm-period. You can then calculate the mean and standard deviation of the R&D/NI ratio. That'll show what those conglomerates have settled on. It's only a measure of a single variable though. You need to do something else to illustrate the relationship. I would use regression to calculate the relationship between R&D spending margin and NIM, but I don't know if they've asked for that. If it's primarily supposed to be a data visualization exercise, you can just plot this stuff to show the relationships.
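
    A rough sketch of the common-size step described above. The two firms and all of their figures are invented; the point is only that dividing by revenue puts a small firm and a large firm on the same scale before the mean and standard deviation are taken.

```python
import statistics

# Hypothetical firm-period records: (firm, year, revenue, rd_spend, net_income).
records = [
    ("A", 2017, 100.0, 5.0, 10.0),
    ("B", 2017, 2000.0, 120.0, 300.0),
    ("A", 2018, 110.0, 6.6, 11.0),
    ("B", 2018, 2100.0, 105.0, 210.0),
]

# Common-size ratios: each line item relative to revenue.
rd_margin = [rd / rev for _, _, rev, rd, _ in records]  # R&D / revenue
nim = [ni / rev for _, _, rev, _, ni in records]        # net income / revenue
rd_to_ni = [rd / ni for _, _, _, rd, ni in records]     # R&D / net income

print("R&D margin:", [round(x, 3) for x in rd_margin])
print("NIM:       ", [round(x, 3) for x in nim])
print(f"R&D/NI mean = {statistics.mean(rd_to_ni):.3f}, "
      f"SD = {statistics.pstdev(rd_to_ni):.3f}")
```

    Firm B spends roughly twenty times what firm A does in raw dollars, yet the ratios land in the same narrow band, so the SD of the ratio describes behavior rather than company size.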
     
    Last edited: Oct 11, 2020
    tophyr and Phl218 like this.
  10. Chino52405

    Chino52405 Well-Known Member

    Lott Industries? I thought those commercials were just a local thing growing up.
     
  11. SuddenBraking

    SuddenBraking The Iron Price

    You'd almost certainly have to lag the operating profit (we'll call it EBITDA to oversimplify) to truly see a causation/correlation between that and the prior year(s)' R&D expenditures.
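
    One way to sketch the lagging idea: pair R&D in year t with profit in year t + lag before computing the correlation. The two series below are fabricated so that profit follows the previous year's R&D exactly, and the helper names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(rd, profit, lag):
    """Correlate R&D in year t with profit in year t + lag."""
    n = len(rd) - lag
    return pearson(rd[:n], profit[lag:])

# Hypothetical yearly series; profit is built as 2x the prior year's R&D.
rd = [10, 12, 15, 11, 18, 20]
profit = [25, 20, 24, 30, 22, 36]

same_year = lagged_correlation(rd, profit, 0)
one_year_lag = lagged_correlation(rd, profit, 1)
print(f"same-year r = {same_year:.2f}, one-year-lag r = {one_year_lag:.2f}")
```

    In this contrived data the same-year correlation is weak while the one-year lag is essentially perfect. Real data will be far noisier, and with only four years per firm any lagged estimate rests on very few points.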

    If you let us know what you're trying to glean from this data it'd be much easier to tell you how to get there.
     
    beac83 likes this.
  12. dsapsis

    dsapsis El Jefe de los Monos

    I am guessing this is a stepwise exercise. After he makes the case that R&D are indeed all over the map (making their comparison to other firm metrics problematic), what do you do? Then you would go through a normalization (or other data transformation) process. Then more issues, like lag as Sudden mentions.
    *these* (I kid, sounds like you know basic business stats a lot better than me. I work with crazy nature. )
     
    SuddenBraking likes this.
  13. auminer

    auminer Renaissance Redneck

    Also correlate legal budget to determine what operating profit is derived from patent infringement from smaller companies that can't match your legal budget. *cough*cough*Fraudcomm*cough*cough
     
  14. pscook

    pscook Well-Known Member

    So, the details are in the fact that we are presented with a dataset and guided to ask our own questions, and then just calculate some numbers to show that we understand the process. I'm hung up on the fact that I submitted the project twice now, and I have received feedback to explicitly address the Standard Deviation as well as the range to show understanding of the concepts. However, I have no idea how to generate insights on the standard deviation from my data. I always understood that anything close to 1 (or -1) is good, while anything greater than that means the data is skewed. But what does it mean when my number is in the millions? Do I simplify the numbers, like standard deviation is 0.8 and mean is 1.2? I'm completely lost on how to generate any insights on this data (specifically) or these data (generally). The box in yellow I think is general, as my deviation isn't 2, but I think that my standard deviation is pretty close to the mean, but I don't know how to validate it. The range is 24 (simplified from 2,380,000,000) compared to the mean of 12 (simplified, again).

    [Attached image: upload_2020-10-11_16-39-15.png]
    [Attached image: upload_2020-10-11_16-39-36.png]
     
  15. SGVRider

    SGVRider Well-Known Member

    You are very, very confused my friend.

    1. Standard deviation has nothing to do with the skewness of a distribution. It also has an infinite (positive) range, because it’s derived from the underlying values in your dataset which have an infinite domain.

    2. Standard deviation is best conceptualized as the typical distance of a value from the mean (strictly, the root-mean-square distance), i.e., how much spread there is in the observed values. By definition, the value can never be negative.

    3. Why are you calculating statistics based on a time series from different companies with the raw values? This doesn’t make any sense to me as it would produce completely meaningless numbers. Doing it on financial ratios makes much more sense.

    4. A standard deviation can’t be “good” or “bad” on its own. It’s descriptive. It’s either useful or not useful.
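
    The points above can be checked numerically. This is a small sketch on made-up values: it computes the population SD by hand, compares it with the literal average distance from the mean, and shows that the SD carries the units and scale of the data, so a figure in the hundreds of millions is expected when the data are in dollars.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
mean = statistics.mean(values)

# Population SD: square root of the mean squared distance from the mean.
sd_by_hand = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Mean absolute deviation: the literal average distance from the mean.
mean_abs_dev = sum(abs(v - mean) for v in values) / len(values)

# Rescaling the data rescales the SD by the same factor.
scaled = [v * 100_000_000 for v in values]
sd_scaled = statistics.pstdev(scaled)

print(f"mean = {mean}, SD = {sd_by_hand}, mean abs. deviation = {mean_abs_dev}")
print(f"SD of the rescaled data = {sd_scaled:,.0f}")
```

    Squaring the distances is why the SD cannot be negative, and why it is not limited to a range like -1 to 1 (that range belongs to the correlation coefficient).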
     
  16. pscook

    pscook Well-Known Member

    Yes, I agree. I am very confused. However, I'm just trying to get through this project. To answer point 3, I'm reluctant to tear down my project and start over as I've built all of the other artifacts, and I'm just down to showing that I understand the terms. Which I obviously don't.

    I'll keep studying and see if I can get it together after a good night's sleep.
     
  17. YamahaRick

    YamahaRick Yamaha Two Stroke Czar

  18. 5axis

    5axis Well-Known Member

    Standard deviation uses a feather
    Un-standard deviation uses the whole chicken.
     
  19. Monsterdood

    Monsterdood Well-Known Member

    Here's the simple version. If you have a bunch of numbers all grouped together, the standard deviation will be small. If they are spread out, the standard deviation will be large. So the mean describes the average of a bunch of numbers and the standard deviation describes how close they all are. If you divide the standard deviation by the mean, you normalize the standard deviation into something you can compare to see if numbers are tight together or spread out. Hope that helps...
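
    Dividing the standard deviation by the mean gives what is usually called the coefficient of variation. Applied to the figures quoted earlier in the thread (the variable names are just labels chosen here):

```python
# Figures quoted in the thread for the 20 R&D spending values.
population_mean = 1_190_920_000
population_sd = 802_407_338
value_range = 2_380_000_000

cv = population_sd / population_mean          # coefficient of variation
range_to_mean = value_range / population_mean

print(f"coefficient of variation = {cv:.3f}")  # about 0.674
print(f"range / mean = {range_to_mean:.2f}")    # about 2.00
```

    A standard deviation of roughly two thirds of the mean, and a range of about twice the mean, is one concrete way to say that R&D spending across these firms is widely spread out rather than clustered near the average.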
     
    pscook likes this.
  20. RichB

    RichB Well-Known Member

    From my work perspective, standard deviation also represents uncertainty. But in these academic exercises there's usually a bunch of tools that can be applied, and a few that might actually be relevant to the question. For me, there's no harm in discussing the strengths and weaknesses of an SD analysis: what you thought it might elicit or support, and what it didn't do. You can then say that, based on the outcomes, the limitations of the available data, etc., you have determined that other tools/analyses are better suited to the question at hand. Otherwise, if it's essential, go back through your notes and look at how the lecturer applied it; there might be a specific angle or concept they want covered, or at least addressed, to show you have considered it.
     
