A common problem with scatterplots is that you end up with all of the data points overlapping each other, which means that the viewer can’t get a sense of where the true density of the data lies. For example, they may not be able to see whether there is just one “layer” of data, or whether the points are stacked on top of each other 10 deep.
Solutions to overlapping
There are a few solutions to this problem of overlapping data points. The simplest is to set the “alpha” parameter down to something like 0.25, which makes each of the points see-through. The color will be much lighter where there is only one data point in the space, and the viewer will be able to discern roughly up to four overlapping points before the color gets close to solid. This can get annoying, particularly when there are some outlier points that do not overlap and therefore become difficult to see. It doesn’t look amazing, and there are often better representations.
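As a minimal sketch of the alpha approach, the snippet below draws a semi-transparent scatterplot with matplotlib on made-up random data. The helper `stacked_opacity` is a hypothetical name introduced here for illustration; it assumes standard alpha compositing, under which n points stacked at the same spot have a combined opacity of 1 - (1 - alpha)^n.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def stacked_opacity(alpha, n):
    """Combined opacity of n points drawn on top of each other (standard alpha blending)."""
    return 1 - (1 - alpha) ** n


# Made-up, heavily overlapping data for illustration only
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x + rng.normal(size=5000)

fig, ax = plt.subplots()
# alpha=0.25 makes each point see-through, so dense regions look darker
points = ax.scatter(x, y, alpha=0.25, s=15, edgecolors="none")
ax.set_title("Scatterplot with alpha=0.25")
fig.savefig("scatter_alpha.png")

# How quickly does the color build up as points stack?
for n in (1, 2, 4, 10):
    print(n, "stacked points ->", stacked_opacity(0.25, n))
```

Under that assumption, the color keeps getting darker beyond four stacked points, but each extra layer adds less, which is why very dense regions become hard to tell apart.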
Another simple approach that can sometimes work is to take a sample of the data, down to the point where the data points rarely overlap. There are several obvious drawbacks. One is that outlier data may not make it into the sample, suggesting a smaller spread between the minimum and maximum than you’ll see in the actual data set. Whether this is a bug or a feature depends on what you’re trying to communicate. If you believe that the outliers are truly irrelevant, or at least more distracting than informative, this could be a real advantage. On the other hand, if understanding the minimum and maximum is critical, you may want to leave them in.
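The sampling approach might look something like the sketch below, which uses NumPy on made-up data. The sample size of 1,000 is an arbitrary assumption; in practice you would tune it until the points rarely overlap. Printing the ranges makes the outlier drawback visible: the sample can never be wider than the full data set, and is usually narrower.

```python
import numpy as np

# Made-up data set for illustration only
rng = np.random.default_rng(42)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

sample_size = 1_000  # assumed value; adjust until points rarely overlap
# Draw row indices without replacement so no point is plotted twice
idx = rng.choice(len(x), size=sample_size, replace=False)
x_sample, y_sample = x[idx], y[idx]

# Compare the spread of the sample with the spread of the full data
print("full range:  ", x.min(), x.max())
print("sample range:", x_sample.min(), x_sample.max())
```

You would then pass `x_sample` and `y_sample` to your usual scatterplot call instead of the full arrays.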
This is one of several areas where the seaborn package shines. It has a function that allows you to do a “density plot” that divides the whole plotting space into hexagonal bins. You are no longer plotting individual points, but the number of units that fall into each hexagon, and the hexagons get darker when there are more units in them. This makes for a much nicer representation of the same concept as the alpha approach that I introduced earlier. It tends to look more professional, and your stakeholders will probably be very impressed with this interesting, but easy to grasp, new visualization. Even cooler, if somewhat more difficult to understand, is the contour plot, which lays out the gradations of density much like a map you would see when you were hiking. It takes a little bit of explanation, but it does look pretty good, particularly if you choose a good color scheme. That said, you always have to remember that the goal of a graph is to explain a concept, and the contours can be distracting, particularly if people are trying to interpret them like a bubble chart.