PiML scored test and embeddings - unique scenario #58

Open
@sachinvs7


Hi, it was a very clever design choice that PiML's scored testing does not require the actual model object.

Context:
I have the input features, the target response, and the corresponding model predictions, all for one particular dataset and task.
The problem is that the original model does not consume the features directly: it first transforms them into learned embeddings. There is one text feature, which is embedded by a fine-tuned Hugging Face model, and the remaining non-text features are embedded by an FT-Transformer. A fusion MLP then combines both sets of embeddings to make predictions, so the model has 3 components in total.

I am not concerned about the text feature. I figured I can pass in its fine-tuned n-dimensional representation as n separate columns (because the predicted probabilities depend on the full input X), and then apply scored testing to all the original non-text features as they are.
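For concreteness, the data layout described above could be assembled roughly as follows. This is only a sketch using NumPy and pandas: the helper `build_scored_frame`, the column names, the shapes, and the random data are all hypothetical placeholders, and it deliberately makes no PiML calls.

```python
import numpy as np
import pandas as pd


def build_scored_frame(text_emb, non_text, y_true, y_prob):
    """Join text-embedding columns, raw non-text features, target, and predictions.

    All names here are illustrative assumptions, not part of PiML.
    """
    n_dim = text_emb.shape[1]
    # One column per embedding dimension of the fine-tuned text representation.
    emb_cols = [f"text_emb_{i}" for i in range(n_dim)]
    emb_df = pd.DataFrame(text_emb, columns=emb_cols, index=non_text.index)

    # Non-text features are kept in their original (pre-embedding) form.
    scored = pd.concat([emb_df, non_text], axis=1)
    scored["target"] = np.asarray(y_true)
    scored["prediction"] = np.asarray(y_prob)
    return scored


# Placeholder data standing in for the real embeddings, features, and scores.
rng = np.random.default_rng(0)
n_rows, n_dim = 5, 4
text_emb = rng.normal(size=(n_rows, n_dim))
non_text = pd.DataFrame(
    {
        "age": rng.integers(18, 80, size=n_rows),
        "income": rng.normal(50.0, 10.0, size=n_rows),
    }
)
y_true = rng.integers(0, 2, size=n_rows)
y_prob = rng.uniform(size=n_rows)

scored = build_scored_frame(text_emb, non_text, y_true, y_prob)
print(scored.columns.tolist())
```

The resulting frame has one row per observation, with the n embedding columns, the untouched non-text columns, the target, and the model's predicted probability side by side.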

1. Not all features start off as embeddings, but the final predictions and probabilities are computed from embeddings. Am I at risk of misleading results, given that the original model also converts my non-text features into embeddings?
2. I would sincerely appreciate any thoughts on how I can apply scored testing here, or PiML in general.

@ajzhanghk @ZebinYang @simoncos @CnBDM-Su
