see the discussion here for details on which methods are NOT safe: https://security.stackexchange.com/questions/129683/is-image-blurring-an-unsafe-method-to-obfuscate-information-in-images
This is what I would imagine it would need to look like:
[video input] -> [frames]
per frame:
[frame] -> run thru a person detection segmentation nn -> output an outline mask for the frame
fill in the outline mask in the frame with eg. black
reconstitute [set of frames] -> [output video]