As you can see, the data points on this scatterplot fall into two groups: one along a horizontal line with Y values between 4 and 8, and the other along a slanted line reaching Y=20.
- Could you suggest a statistical criterion to confirm or reject that we really have two datasets here rather than one?
- Are there any methods for separating these points into two datasets that work better than "by eye"?
- Are any of the aforementioned methods implemented in StatSoft Statistica 8?
My best idea so far is to test the distribution of the dependent variable (Y) for the entire dataset and show that it is not normal, then test the two parts of the dataset separately and show that each is normally distributed.
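
To make this idea concrete, here is a minimal Python sketch of it. It assumes the points have already been assigned to two groups (here by construction, on synthetic data that only imitates the plot, not my real data), and it uses the Jarque-Bera test as one possible normality test. The function name `jarque_bera_pvalue` and all the numbers are illustrative assumptions.

```python
import math
import random


def jarque_bera_pvalue(values):
    """Asymptotic p-value of the Jarque-Bera normality test."""
    n = len(values)
    mean = sum(values) / n
    # Biased central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under normality JB is approximately chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    return math.exp(-jb / 2)


# Synthetic stand-in for the scatterplot (assumed values, not real data):
# a flat group around Y = 6 and a slanted group with Y = 2 * X + noise.
random.seed(0)
flat, slanted = [], []
for _ in range(200):
    x = random.uniform(0, 10)
    if random.random() < 0.5:
        flat.append(random.gauss(6, 0.7))
    else:
        slanted.append(2 * x + random.gauss(0, 0.7))

pvalues = {
    "all": jarque_bera_pvalue(flat + slanted),
    "flat": jarque_bera_pvalue(flat),
    "slanted": jarque_bera_pvalue(slanted),
}
for name, p in pvalues.items():
    print(f"{name}: p = {p:.4f}")
```

A small p-value suggests the sample is not normal; a large one only means normality is not rejected, which is weaker than proving it. The test is asymptotic, so for small groups the p-values are approximate.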