(XY)∼N([μXμY],[σ2XρσXσYρσXσYσ2Y]),
YXY∣X∼N(μY+ρσYX−μXσX,σY[1−ρ2]).
In the present case, we have
u1∣v2∼N(0+η1⋅τ⋅1v2−0τ,1⋅[1−(η1⋅τ)2])=N(ητ2v2,1−η2τ2),
which means that
u1=ητ2v2+ξ
where (and this was your first mistake)
ξ∼N(0,1−η2τ2).
We can thus rewrite the first equation
y∗1=δ1z1+α1y2+u1=δ1z1+α1y2+ητ2v2+ξ=δ1z1+α1y2+ητ2(y2−zδ)+ξ.
Now, remember that the conditional probability density function of X=x given Y=y is
fX(x∣y)=fXY(x,y)fY(y).
In the present case, we have
f1(y1∣y2,z)=f12(y1,y2∣z)f2(y2∣z),
which can be rearranged to your expression
f12(y1,y2∣z)=f1(y1∣y2,z)f2(y2∣z).
Then, we can write the likelihood as a function of the densities of the two independent shocks v1,ξ1:
L(y1,y2∣z)=∏inf1(y1i∣y2i,zi)f2(y2i∣zi)=∏inPr(y1i=1)y1iPr(y1i=0)1−y1if2(y2i∣zi)=∏inPr(y∗1i>0)y1iPr(y∗1i≤0)1−y1if2(y2i∣zi)=∏inPr(δ1z1i+α1y2i+ητ2(y2i−ziδ)+ξi>0)y1iPr(δ1z1i+α1y2i+ητ2(y2i−ziδ)+ξi≤0)1−y1if2(y2i∣zi)=∏inPr(ξi>−[δ1z1i+α1y2i+ητ2(y2i−ziδ)])y1iPr(ξi≤−[δ1z1i+α1y2i+ητ2(y2i−ziδ)])1−y1if2(y2i∣zi)=∏inPr⎛⎝⎜ξi−01−η2τ2−−−−−√>−δ1z1i+α1y2i+ητ2(y2i−ziδ)+01−η2τ2−−−−−√⎞⎠⎟y1iPr⎛⎝⎜ξi−01−η2τ2−−−−−√≤−δ1z1i+α1y2i+ητ2(y2i−ziδ)+01−η2τ2−−−−−√⎞⎠⎟1−y1if2(y2i∣zi)=∏inPr⎛⎝⎜ξi1−η2τ2−−−−−√>−wi⎞⎠⎟y1iPr⎛⎝⎜ξi1−η2τ2−−−−−√≤−wi⎞⎠⎟1−y1if2(y2i∣zi)=∏in⎡⎣⎢1−Pr⎛⎝⎜ξi1−η2τ2−−−−−√≤−wi⎞⎠⎟⎤⎦⎥y1iPr⎛⎝⎜ξi1−η2τ2−−−−−√≤−wi⎞⎠⎟1−y1if2(y2i∣zi)=∏i[1−Φ(−wi)]y1iΦ(−wi)1−y1iφ(y2i−ziδτ)=∏inΦ(wi)y1i[1−Φ(wi)]1−y1iφ(y2i−ziδτ)=Φ(w)y1[1−Φ(w)]1−y1φ(y2−zδτ)
where
wi=δ1z1i+α1y2i+ητ2(y2i−ziδ)1−η2τ2−−−−−√.
Φ(z) and
φ(z) are the cumulative density function and probability density function of the standard normal distribution.