Often after you make a scatterplot, the first thing that your end user will ask is “so is there a relationship between the variables?” Just kidding! They will often assume that there is one without backing it up with anything explicit. That can be dangerous, because the human mind is wired to see patterns, even when they do not exist. It’s important to understand what the data is truly telling us, and to communicate that as explicitly as possible so that everybody ends up on the same page. A best fit line, particularly with confidence intervals, is valuable for just that. But matplotlib on its own does not have functionality to do that directly, because it is a plotting library, and this steps over into the realm of analysis. Seaborn takes a more liberal view of its domain than matplotlib, and its regression plots (regplot and lmplot) draw both a best fit line and a confidence interval on top of the scatterplot as standard. For that reason, it is probably the best quick solution to this problem. Seaborn does not report the fit coefficients, though, so if you need the slope and intercept you will have to compute them separately.
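As a rough illustration of the Seaborn approach, here is a minimal sketch using regplot. The data is synthetic and invented for this example, and the variable names are placeholders rather than anything from a real dataset.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Synthetic data, invented for illustration: a linear trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

fig, ax = plt.subplots()

# regplot draws the scatter points, the fitted line, and a shaded
# confidence band (95% here, estimated by bootstrapping).
sns.regplot(x=x, y=y, ci=95, ax=ax)

ax.set_xlabel("x")
ax.set_ylabel("y")
# Call plt.show() or fig.savefig(...) to view the result.
```

The shaded band gives the audience an immediate sense of how much to trust the line, which is the point of drawing it in the first place.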
Other ways to add a best fit line
There are several other solutions as well. Older versions of pandas had an ordinary least squares function that could be used to get the slope and intercept for such a line; in current versions that functionality lives in statsmodels. The advantage of this approach is that you have those parameters on hand for any other calculations that you want to do outside of the graph. You can also bring to bear the full power of scikit-learn, which has a linear regression estimator. What’s nice about that is that if you decide the data would be better represented by a polynomial, that is a small addition to the scikit-learn code. This makes it easy to explore, and potentially to swap in entirely different models.
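To make the scikit-learn route concrete, here is a minimal sketch. It assumes synthetic data invented for this example, and it uses a PolynomialFeatures step in a pipeline as one common way to get a polynomial fit; other arrangements are possible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data, invented for illustration: y = 3x + 2 plus a little noise.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=x.size)

# scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features).
X = x.reshape(-1, 1)

# Straight-line fit: slope and intercept stay available for later calculations.
linear = LinearRegression().fit(X, y)
slope = linear.coef_[0]
intercept = linear.intercept_
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

# Polynomial fit: add a feature-expansion step in front of the same regression.
quadratic = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

# Predictions on an evenly spaced grid, ready to draw over the scatterplot,
# for example with ax.plot(x_grid, y_line).
x_grid = np.linspace(x.min(), x.max(), 200).reshape(-1, 1)
y_line = linear.predict(x_grid)
y_curve = quadratic.predict(x_grid)
```

Because both models expose the same fit and predict methods, swapping in a different estimator only changes the line that constructs the model; the plotting code stays the same.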