Differences

This shows you the differences between two versions of the page.

--- notes:soliton [2016/08/15] – gomida
+++ notes:soliton [2022/11/17] (current) – removed gomida
@@ Line 1: / Line 1: @@
-====== 솔리톤 분포 ======
-===== Soliton Distribution =====
-<sq>
-==== Ideal Soliton Distribution ====
-=== Definition ===
-| $\rho(1) = 1/k$ ||
-| $\rho(i)=1/i(i − 1),$ | $(i = 2,...,k)$ |
-\\
-위 정의에 의해 $k\geq 1,\thinspace i\in\mathbb{N}$ 일 때, Probability Distribution의 조건 $\sum_i\rho(i)=1$ 을 만족한다. 따라서 별도의 normalization은 필요 없지만, 코드로 구현해보면 자료형의 정밀도 한계에 의해 누적합이 1의 근사치로 나타나기도 한다.
-\\
-\\
-=== Graph ===
-<HTML><div style="max-width:400px;margin:0 auto;text-align:center;"></HTML>
-{{:notes:soliton.png?nolink}}
-$$k = 10$$
-<HTML></div></HTML>
-\\
-=== Reference ===
-[[https://docs.switzernet.com/people/emin-gabrielyan/060708-thesis-ref/papers/Luby02.pdf|M. Luby "LT-codes," in Proc. 43rd Annu. IEEE Symp. Foundations of Computer Science (FOCS), Vancouver, BC, Canada, Nov. 2002, pp. 271–280.]]
-</sq>
-<sq>
-==== Math Symbols ====
-| $P$ | $\rho$ | rho |
-| $T$ | $\tau$ | tau |
-| $M$ | $\mu$ | mu |
-| $D$ | $\delta$ | delta |
-</sq>
-<sq>
-==== Robust Soliton Distribution ====
-=== Definition ===
-| $\mu(i)=(\rho(i)+\tau(i)) / (\sum\rho(i)+\tau(i))$ |
-Robust Soliton Distribution $\mu(i)$는 Ideal Soliton Distribution $\rho(i)$에 $\tau(i)$를 더하고 누적합이 1이 되도록 normalization을 수행한 것이다.
-\\
-\\
-| $\tau(i)= 1/im,$ | $(i=1,2,\dots,m-1)$ |
-| $\tau(i)= ln((k/m)/\delta)/m,$ | $(i=m)$ |
-| $\tau(i)= 0,$ | $(i=m+1,\dots,k)$ |
-$\tau(i)$의 정의는 논문보다는 $k/R$을 m으로 표기하는 위키피디아의 표기방법이 더 직관적인 것으로 생각되어 이를 따랐다. 논문에서 정의하는 $R$은 상수 $c$와 $\delta$에 의해 그 값이 결정되는데, 최종적으로는 자연수 m으로 표현되기 때문이다.
-\\
-\\
-=== Graph ===
-<HTML><div style="max-width:400px;margin:0 auto;text-align:center;"></HTML>
-{{:notes:robust_soliton.png?nolink}}
-$$k = 10, m=4, \delta=0.01$$
-<HTML></div></HTML>
-</sq>
-<sq>
-==== CDF ====
-[[notes/gofountain]]하다 보면 솔리톤분포 생성함수를 CDF(Cumulative Distribution Function, 누적분포함수)라고 지칭하고 있는데 PDF와의 차이점을 공부하다가 정리해본다. 우선 용어에서 혼란이 있었는데 정리하자면 하기와 같다.
-| Probability Distribution Function | Probability Mass Function | Discrete variable |
-| ::: | Probability Density Function | Continuous variable |
-| ::: | Cumulative Distribution Function | PMF의 누적합 또는 PDF의 적분 |
-먼저 CDF의 정의를 찾아보자. A real-valued random variable X 대한 함수의 정의는,
-$$F(x) = P(X\leq x)$$
-이 때, P는 임의의 실수 X가 x보다 작거나 같을 확률이다.
-Continuous random variables 에서는 PDF(Probability Density Function)에 대한 적분이고,
-$$F(x)=\int_{-\infty}^x f(t)dt$$
-Discrete random variables 에서는 PMF(Probability Mass Function)에 대한 누적합이다.
-$$F(x)=\sum\limits_{t \leq x} f(t)$$
-</sq>
-<sq>
-==== Golang ====
-Reference
-  * https://github.com/google/gofountain/blob/master/util.go
-=== Ideal Soliton Distribution ===
-<code go>
-package main
-import(
-	"fmt"
-)
-func SolitonDistribution(n int) []float64 {
-	cdf := make([]float64, n+1)
-	cdf[1] = 1 / float64(n)
-	for i := 2; i < len(cdf); i++ {
-		cdf[i] = cdf[i-1] + (1 / (float64(i) * float64(i-1)))
-	}
-	return cdf
-}
-func main() {
-	for i:=1;i<20;i++{
-		fmt.Println(SolitonDistribution(i))
-	}
-}
-</code>
-<code>
-MacBook-Pro-Retina:gofountain-LT-experiment gomida$ go run soliton.go
-[0 1]
-[0 0.5 1]
-[0 0.3333333333333333 0.8333333333333333 0.9999999999999999]
-[0 0.25 0.75 0.9166666666666666 1]
-[0 0.2 0.7 0.8666666666666666 0.95 1]
-[0 0.16666666666666666 0.6666666666666666 0.8333333333333333 0.9166666666666666 0.9666666666666667 1]
-[0 0.14285714285714285 0.6428571428571428 0.8095238095238094 0.8928571428571428 0.9428571428571428 0.9761904761904762 1]
-[0 0.125 0.625 0.7916666666666666 0.875 0.925 0.9583333333333334 0.9821428571428572 1]
-[0 0.1111111111111111 0.6111111111111112 0.7777777777777778 0.8611111111111112 0.9111111111111112 0.9444444444444445 0.9682539682539684 0.9861111111111113 1.0000000000000002]
-[0 0.1 0.6 0.7666666666666666 0.85 0.9 0.9333333333333333 0.9571428571428572 0.9750000000000001 0.9888888888888889 1]
-[0 0.09090909090909091 0.5909090909090909 0.7575757575757576 0.8409090909090909 0.890909090909091 0.9242424242424243 0.9480519480519481 0.965909090909091 0.9797979797979799 0.990909090909091 1]
-[0 0.08333333333333333 0.5833333333333334 0.75 0.8333333333333334 0.8833333333333334 0.9166666666666667 0.9404761904761906 0.9583333333333335 0.9722222222222223 0.9833333333333334 0.9924242424242424 1]
-[0 0.07692307692307693 0.5769230769230769 0.7435897435897435 0.8269230769230769 0.8769230769230769 0.9102564102564102 0.9340659340659341 0.951923076923077 0.9658119658119658 0.9769230769230769 0.9860139860139859 0.9935897435897435 0.9999999999999999]
-[0 0.07142857142857142 0.5714285714285714 0.738095238095238 0.8214285714285714 0.8714285714285714 0.9047619047619048 0.9285714285714286 0.9464285714285715 0.9603174603174603 0.9714285714285714 0.9805194805194805 0.988095238095238 0.9945054945054944 0.9999999999999999]
-[0 0.06666666666666667 0.5666666666666667 0.7333333333333333 0.8166666666666667 0.8666666666666667 0.9 0.9238095238095239 0.9416666666666668 0.9555555555555556 0.9666666666666667 0.9757575757575757 0.9833333333333333 0.9897435897435897 0.9952380952380951 0.9999999999999999]
-[0 0.0625 0.5625 0.7291666666666666 0.8125 0.8625 0.8958333333333334 0.9196428571428572 0.9375000000000001 0.951388888888889 0.9625 0.9715909090909091 0.9791666666666666 0.985576923076923 0.9910714285714285 0.9958333333333332 0.9999999999999999]
-[0 0.058823529411764705 0.5588235294117647 0.7254901960784313 0.8088235294117647 0.8588235294117648 0.8921568627450981 0.9159663865546219 0.9338235294117648 0.9477124183006537 0.9588235294117647 0.9679144385026738 0.9754901960784313 0.9819004524886877 0.9873949579831932 0.992156862745098 0.9963235294117646 0.9999999999999999]
-[0 0.05555555555555555 0.5555555555555556 0.7222222222222222 0.8055555555555556 0.8555555555555556 0.888888888888889 0.9126984126984128 0.9305555555555557 0.9444444444444445 0.9555555555555556 0.9646464646464646 0.9722222222222222 0.9786324786324786 0.9841269841269841 0.9888888888888888 0.9930555555555555 0.9967320261437908 0.9999999999999999]
-[0 0.05263157894736842 0.5526315789473684 0.719298245614035 0.8026315789473684 0.8526315789473684 0.8859649122807017 0.9097744360902256 0.9276315789473685 0.9415204678362573 0.9526315789473684 0.9617224880382774 0.969298245614035 0.9757085020242914 0.9812030075187969 0.9859649122807016 0.9901315789473683 0.9938080495356035 0.9970760233918127 0.9999999999999998]
-MacBook-Pro-Retina:gofountain-LT-experiment gomida$
-</code>
-각 라인의 마지막 값을 살펴보면 누적 합이 1이 아닌 경우들을 관찰 할 수 있다. 이는 분수를 float 자료형으로 풀어 계산 후 합산했기 때문으로 수학적으로는 1이 되어야 한다.
-\\
-\\
-=== Ideal Soliton Distribution ===
-<code go>
-package main
-import(
-	"fmt"
-)
-func robustSolitonDistribution(n int, m int, delta float64) []float64 {
-	pdf := make([]float64, n+1)
-	pdf[1] = 1/float64(n) + 1/float64(m)
-	total := pdf[1]
-	for i := 2; i < len(pdf); i++ {
-		pdf[i] = (1 / (float64(i) * float64(i-1)))
-		if i < m {
-			pdf[i] = 1 / (float64(i) * float64(m))
-		}
-		if i == m {
-			pdf[i] += math.Log(float64(n)/(float64(m)*delta)) / float64(m)
-		}
-		total += pdf[i]
-	}
-	cdf := make([]float64, n+1)
-	for i := 1; i < len(pdf); i++ {
-		pdf[i] /= total
-		cdf[i] = cdf[i-1] + pdf[i]
-	}
-	return cdf
-}
-func main() {
-	for i:=1;i<20;i++{
-		fmt.Println(robustSolitonDistribution(10, i, 0.01))
-	}
-}
-</code>
-</sq>
-<sq>
-==== Python ====
-Reference
-  * http://stats.stackexchange.com/questions/37581/how-do-i-generate-numbers-according-to-a-soliton-distribution/37583#37583
-<code python>
-from __future__ import print_function, division
-import random
-from math import ceil
-def soliton(N, seed):
-  prng = random.Random()
-  prng.seed(seed)
-  while 1:
-    x = random.random() # Uniform values in [0, 1)
-    i = int(ceil(1/x))       # Modified soliton distribution
-    yield i if i <= N else 1 # Correct extreme values to 1
-if __name__ == '__main__':
-  N = 10
-  T = 10 ** 5 # Number of trials
-  s = soliton(N, random.randint(0, 2 ** 32 - 1))
-  f = [0]*N
-  for j in range(T):
-    i = next(s)
-    f[i-1] += 1
-  print("k\tFreq.\tExpected Prob\tObserved Prob\n");
-  print("{:d}\t{:d}\t{:f}\t{:f}".format(1, f[0], 1/N, f[0]/T))
-  for k in range(2, N+1):
-    print("{:d}\t{:d}\t{:f}\t{:f}".format(k, f[k-1], 1/(k*(k-1)), f[k-1]/T))
-</code>
-<code>
-MacBook-Pro-Retina:work gomida$ python soliton.py
-k	Freq.	Expected Prob	Observed Prob
-	10148	0.100000	0.101480
-	49794	0.500000	0.497940
-	16661	0.166667	0.166610
-	8335	0.083333	0.083350
-	5031	0.050000	0.050310
-	3306	0.033333	0.033060
-	2469	0.023810	0.024690
-	1771	0.017857	0.017710
-	1419	0.013889	0.014190
-	1066	0.011111	0.010660
-MacBook-Pro-Retina:work gomida$
-</code>
-</sq>

Differences

TypeError: Cannot access offset of type string on string