A Markov decision process \( \{ X_t, B_t | t \in T\} \) is defined by its decision epochs \( t \), state space \( S \), decision space \( D \) (with actions \( a \)), expected immediate rewards \( r(i,a) \), and transition probabilities \( p(j|i,a) \).
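For concreteness, here is one way these ingredients might be collected in code; a minimal sketch, where the class name `MDP`, the field layout, and all numeric values are assumptions for illustration only, not notation from these notes.

```python
from dataclasses import dataclass

@dataclass
class MDP:
    states: list   # state space S
    actions: list  # decision space D (actions a)
    r: dict        # expected immediate rewards: (i, a) -> r(i, a)
    p: dict        # transition probabilities: (i, a) -> {j: p(j|i, a)}

# A tiny two-state, two-action instance (all values made up):
mdp = MDP(
    states=[0, 1],
    actions=["a", "b"],
    r={(0, "a"): 5.0, (0, "b"): 1.0, (1, "a"): 0.0, (1, "b"): 2.0},
    p={(0, "a"): {0: 0.9, 1: 0.1}, (0, "b"): {0: 0.4, 1: 0.6},
       (1, "a"): {0: 0.2, 1: 0.8}, (1, "b"): {0: 0.5, 1: 0.5}},
)
```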
Furthermore, a map \( \delta_t : S \rightarrow D \) is called a decision rule (at each epoch \( t \), it assigns an action to every state), and a sequence of decision rules \( \pi = (\delta_1, \delta_2, \cdots) \) is called a policy.
If all the decision rules in a policy are identical, i.e., independent of time, the policy is called stationary. In other words, \( \pi = (\delta, \delta, \cdots) \).
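A stationary policy is thus fully determined by a single decision rule; a minimal sketch, assuming hypothetical states and actions as plain Python values:

```python
# One decision rule delta: S -> D (hypothetical two-state example).
delta = {0: "a", 1: "b"}

# A stationary policy applies the same rule at every epoch, so
# pi = (delta, delta, ...) never has to be stored as a sequence:
def pi(t, state):
    """Action prescribed at epoch t; independent of t by stationarity."""
    return delta[state]

assert pi(0, 0) == pi(99, 0) == "a"
```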
This generates a "normal" Markov chain, with a transition matrix
-
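To make this last step concrete, the sketch below assembles the induced chain's transition matrix row by row, taking row \( i \) to be \( p(\cdot | i, \delta(i)) \); the two-state example and all names (`delta`, `p`, `P_delta`) are illustrative assumptions.

```python
import numpy as np

states = [0, 1]             # hypothetical state space S
delta = {0: "a", 1: "b"}    # stationary decision rule: state -> action

# Transition probabilities p(j|i, a), stored as p[(i, a)] = row over j.
p = {
    (0, "a"): [0.9, 0.1], (0, "b"): [0.4, 0.6],
    (1, "a"): [0.2, 0.8], (1, "b"): [0.5, 0.5],
}

# Row i of the induced chain's matrix is p(. | i, delta(i)).
P_delta = np.array([p[(i, delta[i])] for i in states])
print(P_delta)  # an ordinary stochastic matrix: each row sums to 1
```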