How do you test a system that optimizes itself?
I have a system in which one of the core parts is required to work in a non-deterministic fashion. In particular, part of the system behaves randomly (by design). Here is a short description:
- 90% of the time, we select only items that are above the score limit.
- Selection among those is made randomly, with no bias.
- 10% of the time, we select only items that are _below_ the score limit.
- Selection among those is based on the oldest last-shown date.
- To skip "rotten apples", we calculate the median of the lowest group and exclude all items that are below 20% of the median.
The idea behind this is to have a bare-bones system that can adapt in response to input.
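For reference, here is a minimal sketch of the selection rule as described above (Python, with made-up `score` and `last_shown` fields and an injectable random source; treat it as an illustration, not the actual implementation):

```python
import random
import statistics

def select_item(items, score_limit, rng=random):
    """Sketch of the selection rule described above (illustration, not the real code)."""
    if rng.random() < 0.9:
        # 90% of the time: pick uniformly among items above the score limit.
        candidates = [i for i in items if i["score"] > score_limit]
        return rng.choice(candidates) if candidates else None
    # 10% of the time: look only at items at or below the score limit.
    low = [i for i in items if i["score"] <= score_limit]
    if not low:
        return None
    # Skip "rotten apples": drop items scoring below 20% of the group's median.
    cutoff = 0.2 * statistics.median(i["score"] for i in low)
    survivors = [i for i in low if i["score"] >= cutoff]
    # Among what is left, pick the item with the oldest "last shown" date.
    return min(survivors, key=lambda i: i["last_shown"]) if survivors else None
```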
But I am not sure how I can test this in a way that will actually reveal something meaningful about the system.
Thoughts?
Comments
Hi,
What you describe sounds very much like genetic-algorithm-based software. The bad news is that testing a fuzzy logic system is a bitch. I don't think there's a good way to unit test the optimization process itself. In fact, the internal behavior of such systems, i.e. what makes such algorithms tick, is the subject of much academic research (meaning that no one truly understands it).
You can try to measure the end result of the optimization process by running a known set of test data through the system after the optimization. Doing so will at least allow you to make sure that the optimization process reached a satisfactory point.
I can't quite see from your description where the self-optimization occurs. Where is the feedback? Does the score limit change, and is that deterministic?
If you want to find out how random your randomness is, then there are surely mathematical tools that will allow you to decide how random a selected subset is, if you have the original set and the subset.
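For example, a chi-square goodness-of-fit test over the selection counts would tell you whether the "no bias" part holds up. A sketch (assuming scipy is available; the 10,000 draws and the 0.01 significance level are arbitrary choices):

```python
import random
from collections import Counter
from scipy.stats import chisquare

def test_selection_among_high_scorers_is_unbiased():
    rng = random.Random(0)  # fixed seed keeps the test reproducible
    items = [f"item-{n}" for n in range(10)]
    # Stand-in for the "pick randomly, no bias" branch of the selector.
    counts = Counter(rng.choice(items) for _ in range(10_000))
    observed = [counts[i] for i in items]
    # Null hypothesis: every item is equally likely; only reject on strong evidence.
    _, p_value = chisquare(observed)
    assert p_value > 0.01
```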
Do you need to test the output? If so, I would run the selection many times and then run statistics on the returned data:
- Are roughly 90% of the results above the score limit?
- Are all results drawn from the lower group at or above 20% of that group's median?
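In test form that could look roughly like this (reusing the `select_item` sketch from the question; the seed and the tolerance on the 90/10 split are judgment calls):

```python
import random
import statistics

def test_selection_ratios(trials=10_000, score_limit=50):
    rng = random.Random(42)  # fixed seed so the test is reproducible
    items = [{"score": s, "last_shown": s} for s in range(1, 101)]
    low_scores = [i["score"] for i in items if i["score"] <= score_limit]
    cutoff = 0.2 * statistics.median(low_scores)

    above = 0
    for _ in range(trials):
        picked = select_item(items, score_limit, rng=rng)
        if picked["score"] > score_limit:
            above += 1
        else:
            # Anything drawn from the lower group must have survived the median cutoff.
            assert picked["score"] >= cutoff
    # The 90/10 split should hold within a couple of percentage points over 10,000 trials.
    assert 0.88 <= above / trials <= 0.92
```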
I agree with Frank. I remember that in one of our projects, to test the non-deterministic behaviour, we used best-case and worst-case data samples and made sure the tests passed for everything between them. I don't think there is any other way to tackle this problem. The tricky thing is finding the best-case and worst-case samples.
I am assuming you are talking about an Automated Trading System (ATS), where you find these things quite often. The problem with self-optimization in these systems is that they optimize too closely to the exact test data you provide. Because of this, an out-of-sample data range is often used to check that the optimization is not overfitting to the test sample.
I still find this overfits in some cases, so I am thinking of a different approach: pick both the optimization data range and the verification data range at random from a test dataset with a minimum time span, and require that the two ranges not overlap by more than 50%.
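As a rough sketch of that splitting idea (the span length and the 50% rule are the only constraints taken from above; the rest is made up, and it assumes the dataset is long enough for a valid pair of windows to exist):

```python
import random

def pick_ranges(data_len, span, max_overlap=0.5, rng=random):
    """Pick an optimization window and a verification window of length `span`,
    rejecting pairs that overlap by more than `max_overlap` of the span."""
    while True:
        opt_start = rng.randrange(data_len - span + 1)
        ver_start = rng.randrange(data_len - span + 1)
        overlap = max(0, min(opt_start, ver_start) + span - max(opt_start, ver_start))
        if overlap <= max_overlap * span:
            return (opt_start, opt_start + span), (ver_start, ver_start + span)
```

You would then optimize on the first window and verify on the second.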
Hope this is somewhat in the direction you are thinking of?
-Mark
From your description, it sounds like you have a combination of randomness and deterministic logic. Making randomness deterministic is not particularly hard: blogs.msdn.com/.../TestingAgainstRandomness.aspx
Having done that, you should be able to test the rest of the algorithm in a deterministic manner.
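The simplest version of that, assuming the algorithm takes its random source as a parameter (as in the sketch near the top of the post), is to hand it a seeded generator in tests:

```python
import random

def test_same_seed_gives_same_selections():
    items = [{"score": s, "last_shown": s} for s in range(1, 101)]
    rng_a = random.Random(1234)
    rng_b = random.Random(1234)
    # Two generators with the same seed drive identical selection sequences,
    # so failures are reproducible and you can assert on exact outputs.
    run_a = [select_item(items, 50, rng=rng_a) for _ in range(20)]
    run_b = [select_item(items, 50, rng=rng_b) for _ in range(20)]
    assert run_a == run_b
```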
Since you are such a smart guy, you would obviously have thought of this yourself, so I probably misunderstood the problem...
Use Dependency Injection to supply the randomizer and ensure that the rest of the logic is deterministic given a certain input set from the randomizer. Then you can mock the randomizer and ensure that the algorithm works correctly.
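With the random source injected, a scripted fake can pin down each branch exactly. A sketch against the `select_item` sketch from the question (the fake simply replays whatever values you script):

```python
class FakeRandom:
    """Scripted stand-in for the injected randomizer."""
    def __init__(self, rolls, picks):
        self.rolls = iter(rolls)   # values returned by .random()
        self.picks = iter(picks)   # indices consumed by .choice()

    def random(self):
        return next(self.rolls)

    def choice(self, seq):
        return seq[next(self.picks)]

def test_low_score_branch_skips_rotten_apples():
    items = [{"score": s, "last_shown": s} for s in (1, 10, 40, 60, 90)]
    # A roll of 0.95 forces the 10% branch; .choice() is never hit on that path.
    picked = select_item(items, score_limit=50, rng=FakeRandom(rolls=[0.95], picks=[]))
    # Lower group is {1, 10, 40}: median 10, cutoff 2.0, so the score-1 item is excluded,
    # and the oldest remaining "last shown" belongs to the score-10 item.
    assert picked["score"] == 10
```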
I'm guessing you'd use a standard random number generator provided by a library, and to a large extent you'd have to take it as a given that it achieves its stated randomness.
Like most comments imply, that will test that the implementation of the algorithm is as you had planned. Going beyond that to get a heuristic sense of whether the algorithm itself meets the design goals is a bit trickier. Maybe this is what you were getting at?
Could you run a large set of data through it and then assert on the ratios?
-d