Wikipedia:Reference desk/Archives/Mathematics/2019 August 26

Source: Wikipedia, the free encyclopedia.
Mathematics desk


August 26

Bell curve formula with skew

I am using the following formula for a bell curve (please feel free to format this so it looks good): y = (1/(s*sqrt(2*p)))*(e^(-((x-a)^2)/((2*s)^2)), where a = mean, s = standard deviation, p = pi, and e = Euler's number. In case I made a typo, this is the actual code:

ls = 1/(s*sqrt(2*p));
dn = pow(2*s,2);
y = ls*pow(e, -(pow(x-a,2)/dn));

What I want to do is add a skew to it. For example, with a skew parameter ranging from -1 to 1, a skew of -1 would place the mean almost all the way at the left side, a skew of 1 would place it almost all the way at the right side, and a skew of 0 would give a normal bell curve. Is there a commonly accepted formula that includes this type of skew? 199.164.8.1 (talk) 13:58, 26 August 2019 (UTC)

A bit off-topic: the parentheses don't seem to balance. Possibly you meant: y = (1/(s*sqrt(2*p)))*e^(-((x-a)^2)/((2*s)^2)).
BTW, wouldn't it be easier to say y = (1/(s*sqrt(2*p)))*exp(-((x-a)/(2*s))^2)? --CiaPan (talk) 15:01, 26 August 2019 (UTC)
That is not the optimized code, so your simplification will work. In the optimized code, dn = 1/dn, and the equation then has *dn instead of /dn. I have to loop through about four million values of x to get the value of y. The optimization replaces millions of floating-point divisions with floating-point multiplications for a noticeable reduction in clock cycles. What isn't shown is that I'm editing a Fortran procedure, so optimization is manual. 199.164.8.1 (talk) 15:46, 26 August 2019 (UTC)
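The reciprocal trick described here can be sketched as follows. This is an illustrative Python sketch, not the Fortran procedure from the thread, and the function name bell_curve_values is hypothetical. It also assumes the textbook normal density, whose exponent is divided by 2*s^2; the code posted above divides by (2*s)^2 = 4*s^2 instead, which gives a wider curve whose area is sqrt(2) rather than 1.

```python
import math

def bell_curve_values(xs, a, s):
    """Evaluate the normal density with mean a and standard deviation s at each x."""
    # Loop-invariant factors are computed once, outside the loop.
    ls = 1.0 / (s * math.sqrt(2.0 * math.pi))
    inv_dn = 1.0 / (2.0 * s * s)  # reciprocal, so the loop multiplies instead of dividing
    ys = []
    for x in xs:
        d = x - a
        ys.append(ls * math.exp(-(d * d) * inv_dn))
    return ys

# Usage: sample the curve on a regular grid and estimate the area under it.
step = 0.01
xs = [-10.0 + i * step for i in range(2001)]
ys = bell_curve_values(xs, a=0.0, s=1.0)
area = sum(ys) * step  # should come out close to 1 for the textbook density
```

Whether hoisting the reciprocal helps in practice depends on the compiler and hardware; the sketch only shows the shape of the rewrite.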
You may want the skew normal distribution, although it's not necessarily the only possible generalization. Addendum: a bit more is mentioned in that article; see also the exponentially modified Gaussian distribution. Different options are going to have different properties, and which one you'd want probably depends on what you're trying to do. –Deacon Vorbis (carbon • videos) 15:08, 26 August 2019 (UTC)
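For reference, the skew normal density has a closed form that can be coded directly: f(x) = (2/w) * phi(z) * Phi(alpha*z) with z = (x - xi)/w, where phi and Phi are the standard normal density and cumulative distribution function. Below is a minimal Python sketch with hypothetical function names, using math.erf to compute Phi. Note two assumptions that differ from the original question: the shape parameter alpha ranges over all real numbers rather than from -1 to 1, and xi and w are location and scale parameters, not the mean and standard deviation of the skewed curve.

```python
import math

def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Standard normal cumulative distribution Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def skew_normal_pdf(x, xi, w, alpha):
    """Skew normal density with location xi, scale w > 0, and shape alpha."""
    z = (x - xi) / w
    return (2.0 / w) * std_normal_pdf(z) * std_normal_cdf(alpha * z)

# Usage: alpha = 0 gives the ordinary bell curve; alpha > 0 pushes mass to the right.
step = 0.01
xs = [-10.0 + i * step for i in range(2001)]
area = sum(skew_normal_pdf(x, 0.0, 1.0, 4.0) for x in xs) * step
```

Unlike a simple shift, this form keeps the total area at 1 for every alpha, at the cost of a peak height that changes with the skew.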
I read through that, but didn't see anything that I could use. I will look again. For example, I saw that I could multiply y by dy(x)/dy. But "dy(x)" is not an applicable formula for computer programming (and neither is dx). I would have to differentiate the equation first. That made me wonder if there was a standard solution instead of differentiating the equation. 199.164.8.1 (talk) 15:48, 26 August 2019 (UTC)
I have a suggestion. The higher the value of y, the more you want it to be offset along the x-axis, correct? "a" is the initial value of that offset. So, after you calculate the value of y using the regular formula, try recalculating it, but this time, in addition to "a", plug in the value of y from the previous calculation (which I will call y0), multiplied by some skew factor, k, which may be positive or negative. Like so:
y0 = (1/(s*sqrt(2*p)))*e^(-((x-a)^2)/((2*s)^2))
y = (1/(s*sqrt(2*p)))*e^(-((x-a-k*y0)^2)/((2*s)^2))
(Figure: the curve skewed to each side with k = +0.5 and k = -0.5, with a = 1 and the rest of the formula simplified so it would fit in the plot window.)
Or, to put it in terms of your code:
ls = 1/(s*sqrt(2*p));
dn = pow(2*s,2);
y0 = ls*pow(e, -(pow(x-a,2)/dn));
y = ls*pow(e, -(pow(x-a-k*y0,2)/dn));

Try that out and see if it does what you need. (Note that the area under the curve is reduced by this skewing effect, with more skewing causing more reduction. This is because the highest part of the curve is narrower, while the maximum height remains the same. Is that OK? The skew normal distribution method increases the maximum to avoid this effect.) SinisterLefty (talk) 04:31, 27 August 2019 (UTC)
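To try the two-pass idea numerically, here is a rough Python sketch (the function name two_pass_skew is hypothetical). It keeps the (2*s)^2 denominator exactly as posted in this thread, although the textbook normal density uses 2*s^2, and it estimates the area with a plain Riemann sum so the effect of k on the area can be inspected.

```python
import math

def two_pass_skew(x, a, s, k):
    """Two-pass skew from the thread: shift x by k times the unskewed height y0."""
    ls = 1.0 / (s * math.sqrt(2.0 * math.pi))
    inv_dn = 1.0 / (2.0 * s) ** 2  # (2*s)^2 as posted; the textbook density uses 2*s^2
    y0 = ls * math.exp(-((x - a) ** 2) * inv_dn)           # first pass: unskewed curve
    return ls * math.exp(-((x - a - k * y0) ** 2) * inv_dn)  # second pass: shifted by k*y0

# Usage: compare k = 0 and k = 0.5 with a = 1, s = 1 on a fine grid.
step = 0.001
xs = [-20.0 + i * step for i in range(40001)]
plain = [two_pass_skew(x, 1.0, 1.0, 0.0) for x in xs]
skewed = [two_pass_skew(x, 1.0, 1.0, 0.5) for x in xs]
area_plain = sum(plain) * step
area_skewed = sum(skewed) * step
peak_x = xs[skewed.index(max(skewed))]  # grid location of the skewed peak
```

With these values the peak keeps its height and moves to roughly x = 1.2, while the estimated area drops slightly below the unskewed one, consistent with the note above. If a unit area matters, the result would need to be rescaled.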
You could have different standard deviations on each side. By definition, each side has 50% of the population. But one side will be compact and the other will be long. 2600:1004:B053:6718:9C3E:1FC5:DD8B:B2CA (talk) 14:07, 27 August 2019 (UTC)
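A minimal Python sketch of this idea, with hypothetical function names, might look like the following. The first function takes the suggestion literally: each side is half of a normal curve with its own standard deviation, so each side holds 50% of the area, but the height then jumps at the mean unless the two deviations are equal. The second function is the continuous variant usually called the two-piece (split) normal, which shares one peak height; in that case the areas on the two sides are proportional to s_left and s_right rather than equal.

```python
import math

def half_mass_pdf(x, a, s_left, s_right):
    """Half a normal curve per side, each with its own sd; each side has area 0.5."""
    s = s_left if x < a else s_right
    return math.exp(-((x - a) ** 2) / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))

def two_piece_pdf(x, a, s_left, s_right):
    """Continuous variant: one shared peak height, side areas proportional to the sds."""
    s = s_left if x < a else s_right
    peak = 2.0 / ((s_left + s_right) * math.sqrt(2.0 * math.pi))
    return peak * math.exp(-((x - a) ** 2) / (2.0 * s * s))

# Usage: midpoint-rule estimate of the area on each side of the mean a = 0.
h = 0.001
left_mass = sum(half_mass_pdf(-(i + 0.5) * h, 0.0, 0.5, 2.0) for i in range(20000)) * h
right_mass = sum(half_mass_pdf((i + 0.5) * h, 0.0, 0.5, 2.0) for i in range(20000)) * h
left_mass_cont = sum(two_piece_pdf(-(i + 0.5) * h, 0.0, 0.5, 2.0) for i in range(20000)) * h
```

Which variant is preferable depends on whether a continuous curve or an even 50/50 split matters more for the application.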
One possible definition of a "skewed normal distribution" is one that was normal, but no longer is, after having been skewed. So, the usage is similar to "dwarf planet", which isn't actually a planet. From the OP's description, this is what I think they meant. If so, then the 50%-on-each-side condition would no longer apply. SinisterLefty (talk) 16:19, 27 August 2019 (UTC)