I can neither claim to be a photographer nor that https://www.dansmithphotography.com/ is my website, but I appreciate the example! The specific photo, for others' reference, based on the filename: https://payload.cargocollective.com/1/15/509333/14386490/L-o...

That said, I'm not as impressed with the description. The structure has some wood but it's certainly not just wooden, and there are distant mountains but not much in the way of rolling hills to speak of. The dress is flowing but the waist is not knotted - the more striking detail to note might have been the sleeves.

For 4 GB of model I'm not going to ding it too badly though. My question about which quant was mainly about the tokens/second angle (q4 needs roughly 1/4 the memory bandwidth of the full model, assuming 16-bit weights) rather than the quality angle. As a note: a larger multimodal model gets all of these points right (e.g. "wooden and stone rustic structure"), so they aren't just things I noticed myself.
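
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes token generation is purely memory-bandwidth bound, i.e. every weight is read once per generated token, which ignores KV cache reads, compute limits, and the extra bits real q4 formats spend on scales. The parameter count and bandwidth figure are hypothetical values picked for illustration, not measurements of the model discussed here.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in bytes."""
    return n_params * bits_per_weight / 8


def max_tokens_per_second(n_params: float, bits_per_weight: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if each token requires reading all weights once."""
    return bandwidth_bytes_per_s / weight_bytes(n_params, bits_per_weight)


# Hypothetical figures, for illustration only.
N_PARAMS = 8e9      # assumed 8B parameters (about 4 GB at 4 bits per weight)
BANDWIDTH = 100e9   # assumed 100 GB/s of memory bandwidth

fp16_tps = max_tokens_per_second(N_PARAMS, 16, BANDWIDTH)
q4_tps = max_tokens_per_second(N_PARAMS, 4, BANDWIDTH)

print(f"16-bit: {fp16_tps:.2f} tok/s upper bound")
print(f"4-bit:  {q4_tps:.2f} tok/s upper bound")
print(f"speedup: {q4_tps / fp16_tps:.1f}x")
```

Under those assumptions the ratio is simply 16/4, whatever bandwidth you plug in; actual speedups are usually somewhat lower.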