Adaptation
s(t)=ke−t−(k−1)e−kk−1t,k>1.
This s(t) (which Ted Dunning calls 'Canny's filter') has desirable properties of s′(0)=0 and s(0)=1.
In our case, it takes form of
y(x)=ke−x/α−(k−1)e−kx/α(k−1)
Given margin m and decay time Δt , so that
y(Δt)=m,
solving for α, we get estimate for α assuming margin m≪1 as
α≈−Δtln(m/k).
I take Δt as constructor input with good default values of k=4 and m=0.01.
Iterative update tn+1≥tn
{{π0=1,ν0=1,π=e−(tn+1−tn)/α;ν=e−k(tn+1−tn)/α(k−1);wn+1=1+πwn;vn+1=1+νvn;sn+1=xn+1+πsn;un+1=xn+1+νun;tn+1=tn+1.
Then average μ=ks−(k−1)ukw−(k−1)v.
Persisted summarizer state thus consists of parameters {w,v,s,u,t,k,α} . Parameters k and α are set in constructor and thus don't evolve.
Iterative update in-the-past tn+1<tn
{{π0=1,ν0=1;π=e−(tn−tn+1)/α,ν=e−k(tn−tn+1)/α(k−1);wn+1=wn+π,vn+1=vn+ν,sn+1=sn+πxn+1,un+1=un+νxn+1,tn+1=tn.
Combining Canny summarizers
Combining two summarizer states S1,S2 having observed two disjoint sets of original observation superset (for algebraic pig UDF, MapReduce use):
Case t2≥t1 :
{t=t2;π=e−(t2−t1)/α;ν=e−k(t2−t1)/α(k−1);s=s2+s1π;u=u2+u1ν;w=w2+w1π;v=v2+v1ν.
Merging in positive samples only for binomial cases
Case t(1)≥t(0) :
{t=t(1);π=e−(t(1)−t(0))/α;ν=e−k(t(1)−t(0))/α(k−1);s=s(1)+s(0)π;u=u(1)+u(0);w=w(0)π;v=v(0)ν.
Case t(1)<t(0) :
{t=t(0);(nochange)π=e−(t(0)−t(1))/α;ν=e−k(t(0)−t(1))/α(k−1);s=s(0)+s(1)π;u=u(0)+u(1)ν;w=w(0);(nochange)v=v(0)(nochange).
Pecularities of Canny rate summarizer implementation
Assuming π0=1,ν0=1 , but otherwise
π=exp(−|tn+1−tn|α),ν=exp(−k⋅|tn+1−tn|α(k−1)),
for the case of tn+1≥tn :
{wn+1=πwn+α(1−π);vn+1=νvn+α(k−1)(1−ν)k;
and for the case of tn+1<tn :
{wn+1=max(wn,α(1−π));vn+1=max(vn,α(k−1)(1−ν)k).
(of course in the latter case if wn≥α(1−π)≡vn≥α(k−1)(1−ν)k , so we can optimize a little based on that).
The logic of advancing sn+1 and un+1 is the same as in Canny average summarizer.
Pecularities of combining Canny rate summarizers
Assuming both summarizers have nonzero history and α(1)=α(2)=α, and k(1)=k(2)=k as a precondition, combining routine looks like
π=exp(−|t(2)−t(1)|α),ν=exp(−k⋅|t(2)−t(1)n|α(k−1)),{{t=t(1)s=s(1)+πs(2)u=u(1)+νu(2),t(1)≥t(2);{t=t(2)s=s(2)+πs(1)u=u(2)+νu(1),t(1)<t(2);w=max(w(1),w(2)),allcases;v=max(v(1),v(2)),allcases.
In simple words, here we take longest history of two (expressed by denominator terms w and v ), and merge numerators based on which history contained the latest observation. Again, we can assume w(1)≥w(2)≡v(1)≥v(2) to save on comparisons a little bit.
Iterative update in-the-past tn+1<tn
{{π0=1,ν0=1;π=e−(tn−tn+1)/α,ν=e−k(tn−tn+1)/α(k−1);wn+1=wn+π,vn+1=vn+ν,sn+1=sn+πxn+1,un+1=un+νxn+1,tn+1=tn.
Combining Canny summarizers
Combining two summarizer states S1,S2 having observed two disjoint sets of original observation superset (for algebraic pig UDF, MapReduce use):
Case t2≥t1 :
{t=t2;π=e−(t2−t1)/α;ν=e−k(t2−t1)/α(k−1);s=s2+s1π;u=u2+u1ν;w=w2+w1π;v=v2+v1ν.
Merging in positive samples only for binomial cases
Case t(1)≥t(0) :
{t=t(1);π=e−(t(1)−t(0))/α;ν=e−k(t(1)−t(0))/α(k−1);s=s(1)+s(0)π;u=u(1)+u(0);w=w(0)π;v=v(0)ν.
Case t(1)<t(0) :
{t=t(0);(nochange)π=e−(t(0)−t(1))/α;ν=e−k(t(0)−t(1))/α(k−1);s=s(0)+s(1)π;u=u(0)+u(1)ν;w=w(0);(nochange)v=v(0)(nochange).
Pecularities of Canny rate summarizer implementation
Assuming π0=1,ν0=1 , but otherwise
π=exp(−|tn+1−tn|α),ν=exp(−k⋅|tn+1−tn|α(k−1)),
for the case of tn+1≥tn :
{wn+1=πwn+α(1−π);vn+1=νvn+α(k−1)(1−ν)k;
and for the case of tn+1<tn :
{wn+1=max(wn,α(1−π));vn+1=max(vn,α(k−1)(1−ν)k).
(of course in the latter case if wn≥α(1−π)≡vn≥α(k−1)(1−ν)k , so we can optimize a little based on that).
The logic of advancing sn+1 and un+1 is the same as in Canny average summarizer.
Pecularities of combining Canny rate summarizers
Assuming both summarizers have nonzero history and α(1)=α(2)=α, and k(1)=k(2)=k as a precondition, combining routine looks like
π=exp(−|t(2)−t(1)|α),ν=exp(−k⋅|t(2)−t(1)n|α(k−1)),{{t=t(1)s=s(1)+πs(2)u=u(1)+νu(2),t(1)≥t(2);{t=t(2)s=s(2)+πs(1)u=u(2)+νu(1),t(1)<t(2);w=max(w(1),w(2)),allcases;v=max(v(1),v(2)),allcases.
In simple words, here we take longest history of two (expressed by denominator terms w and v ), and merge numerators based on which history contained the latest observation. Again, we can assume w(1)≥w(2)≡v(1)≥v(2) to save on comparisons a little bit.
No comments:
Post a Comment