A distribution has a certain mean and variance, but what we observe in any single sample can differ from them. In a sampling-distribution exercise we usually assume (or "recognize") the underlying distribution first, then predict what each sample of size n should look like; if we run the simulation many times, the average of the sample means should converge to the predicted mean. But the question is: how do we truly recognize any distribution in the first place, other than an idealized Bernoulli trial? Even a coin toss isn't exactly 50/50 in real life, because of small dents and differences in tossing angle.
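The simulation described above can be sketched as follows. This is a minimal illustration, not a definitive method: the bias `p_true = 0.51` is an assumed value chosen just to represent a not-quite-fair coin, and the sample size and trial count are arbitrary.

```python
import numpy as np

# Hypothetical "true" distribution we assume up front: a slightly
# biased coin, illustrating that a real coin need not be exactly 50/50.
rng = np.random.default_rng(seed=0)
p_true = 0.51    # assumed bias (illustrative, not measured)
n = 100          # sample size per trial
trials = 10_000  # number of simulated samples

# Each row is one sample of n flips; sample_means is then the simulated
# sampling distribution of the mean (the proportion of heads per sample).
flips = rng.binomial(1, p_true, size=(trials, n))
sample_means = flips.mean(axis=1)

# The average of the sample means should sit near p_true, and their
# spread near the theoretical standard error sqrt(p(1-p)/n).
print(sample_means.mean())
print(sample_means.std())
```

The point of the sketch is that everything downstream (the predicted center and spread of the sampling distribution) depends on the distribution we assumed at the start; the simulation cannot tell us whether `p_true = 0.51` was the right model for the physical coin.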