Mar 03, 2018 Added more features and continued polishing.
Mar 05, 2018 Added the Mode calculation.
Mar 06, 2018 Enhanced Mode, added FreqDist (frequency distribution), and enhanced error reporting.
----
I started a similar CodeBank thread before, but I now think I made it too complex, as there was no interest. Just looking around earlier today, I saw a request under a CodeBank entry by The Trick. At this point, I've addressed all the requests made by CreativeDreamer.
Basically, I've just provided some one-sample statistical functions. I've also made a decision on how to handle missing values. I've struggled with this in VB6. One option is certainly the use of Variant. However, I've never been terribly happy with that option. Therefore, I've decided on sticking with Double arrays for my data, and using the IEEE Double NaN value to denote missing values. This can be seen in the code.
Now, for the uninitiated, NaN values can be a bit tricky. They're somewhat similar to the Null value, but even more restrictive. Once you get a NaN, you can continue to do math with it, but the results will be NaN (similar to Null in Variants). However, you can't do Boolean comparisons with a NaN. In other words, they'll crash if used in an If statement. Therefore, anyone using these functions needs to develop a practice of checking return values with the IsNaN() function. This will keep you out of trouble.
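To make the NaN idea a bit more concrete, here's a minimal sketch of the pattern. This is not the code from modSimpleStats. MakeNaN, IsNaNBits, MeanSkippingNaN and DemoNaN are hypothetical names made up for illustration, and the module's own NaN and IsNaN procedures may well be written differently. The sketch builds a NaN from its IEEE bit pattern, detects one by looking at the bits rather than comparing, and only does a comparison after the NaN check.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As Long)

' Hypothetical helper: returns a quiet NaN built from its IEEE 754
' bit pattern (&H7FF8000000000000).
Public Function MakeNaN() As Double
    Dim lBits(0 To 1) As Long
    Dim d As Double
    lBits(0) = 0            ' Low 32 bits.
    lBits(1) = &H7FF80000   ' High 32 bits: exponent all ones, top mantissa bit set.
    CopyMemory d, lBits(0), 8
    MakeNaN = d
End Function

' Hypothetical helper: True if the value is a NaN.
' It inspects the raw bits, so no floating-point comparison is made.
Public Function IsNaNBits(ByVal dValue As Double) As Boolean
    Dim lBits(0 To 1) As Long
    CopyMemory lBits(0), dValue, 8
    ' NaN = exponent bits all ones AND a non-zero mantissa.
    If (lBits(1) And &H7FF00000) = &H7FF00000 Then
        IsNaNBits = ((lBits(1) And &HFFFFF) <> 0) Or (lBits(0) <> 0)
    End If
End Function

' Hypothetical example: mean of the non-missing values.
' Returns NaN when there is nothing left to average.
Public Function MeanSkippingNaN(dData() As Double) As Double
    Dim i As Long
    Dim n As Long
    Dim dSum As Double
    For i = LBound(dData) To UBound(dData)
        If Not IsNaNBits(dData(i)) Then
            dSum = dSum + dData(i)
            n = n + 1
        End If
    Next
    If n = 0 Then
        MeanSkippingNaN = MakeNaN()
    Else
        MeanSkippingNaN = dSum / n
    End If
End Function

Public Sub DemoNaN()
    Dim d(0 To 3) As Double
    Dim m As Double
    d(0) = 2: d(1) = MakeNaN(): d(2) = 4: d(3) = 6   ' d(1) is "missing".
    m = MeanSkippingNaN(d)                           ' Mean of 2, 4 and 6.
    If IsNaNBits(m) Then
        Debug.Print "No valid data."
    ElseIf m > 3 Then                                ' Safe: m is known not to be NaN here.
        Debug.Print "Mean is above 3: "; m
    End If
End Sub
```

The point of the last procedure is the order of the tests: the NaN check comes first, and the ordinary comparison only runs once the value is known to be a real number.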
Now, most of what I did is straightforward. However, I did dip into calculating a p-value (and confidence intervals), which requires "distributions". I've leaned on the ALGLIB project to derive my PDF (probability density function [not portable document format]) and CDF (cumulative distribution function) values.
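As a rough illustration of where the distributions come in, here's a sketch of the textbook one-sample t statistic: the sample mean minus the hypothesized mean, divided by the standard error (sample standard deviation over the square root of n). TStatisticSketch is a hypothetical name and is not the module's OneSampleStudentT procedure, whose signature and NaN handling aren't shown here. The sketch assumes complete data (no NaNs), at least two values, and non-zero variance.

```vb
Option Explicit

' Hypothetical sketch: one-sample t statistic, testing the sample mean
' against dMu0.  Assumes no NaNs, n >= 2, and non-zero variance.
Public Function TStatisticSketch(dData() As Double, ByVal dMu0 As Double) As Double
    Dim i As Long
    Dim n As Long
    Dim dMean As Double
    Dim dSumSqDiff As Double
    Dim dStErr As Double

    n = UBound(dData) - LBound(dData) + 1

    ' Sample mean.
    For i = LBound(dData) To UBound(dData)
        dMean = dMean + dData(i)
    Next
    dMean = dMean / n

    ' Sum of squared differences from the mean.
    For i = LBound(dData) To UBound(dData)
        dSumSqDiff = dSumSqDiff + (dData(i) - dMean) ^ 2
    Next

    ' Standard error = sample standard deviation / Sqr(n).
    dStErr = Sqr(dSumSqDiff / (n - 1)) / Sqr(n)

    TStatisticSketch = (dMean - dMu0) / dStErr
End Function

Public Sub DemoT()
    Dim d(0 To 2) As Double
    d(0) = 2: d(1) = 4: d(2) = 6
    ' Mean = 4, standard deviation = 2, standard error = 2 / Sqr(3),
    ' so t against a hypothesized mean of 3 is about 0.866.
    Debug.Print TStatisticSketch(d, 3)
End Sub
```

Getting from t to a p-value is the part that needs a distribution: the statistic is looked up in the CDF of Student's t with n - 1 degrees of freedom. For a two-sided test that would be twice the upper-tail area beyond the absolute value of t. Whether the attached project reports a one-sided or two-sided value is something to check in its code.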
I've attached a complete project. If you're interested, focus first on the modSimpleStats module. If you don't need p-values or confidence intervals, you can delete the bottom portion (after the comment that reads "From here down requires the distributions"), and then you can also delete all the modules having to do with distributions. When I first put this up, I posted the code, but it's now over the 25,000 character post limit, so you will need to review it in the attached sample project. The comments from the top of the modSimpleStats module are listed further down.
And, as stated, a complete "runnable" project is attached. I've also designed a bit of a user interface just for testing. Here's a sample of that (possibly without showing the very latest additions):
[Attached image: Stats.jpg]
Code:
Option Explicit
'
' List of "helper" procedures:
' NaN
' IsNaN
' ChangeMissingToNaN
' DblDims
' SortData
' FilterNaNs
'
' List of statistics procedures:
' Min
' Max
' Count
' Sum
' Mean
' Mode
' FreqDist
' SumSq
' SumSqDiff
' VariancePop or MeanSqPop
' VarianceSamp or MeanSqSamp
' StDevPop
' StDevSamp
' StErr
' OneSampleStudentT
' OneSampleTTestPValue
' OneSampleConfInt
'
' Quantile
' Percentile
' Quartile
' Median
' Range
' InterQuartileRange
'
Please feel free to make additional requests, and I'll possibly add them.
Take Care,
Elroy