We're trying to create a workflow using Schema Automator that requires as little intervention as possible after generating a schema from multiple TSVs. One issue we've run into is that we can't specify the class names derived from the different files.
When running `schemauto generalize-tsvs filea.tsv fileb.tsv filec.tsv`, the names of the resulting classes are derived from the individual filenames. That derivation happens here: schema-automator/schema_automator/generalizers/csv_data_generalizer.py, lines 249 to 250 in 99aff03.
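To illustrate the behaviour described above, here is a minimal sketch of a filename-based derivation. It is not the code at the referenced lines: `class_name_from_path` is a hypothetical helper, and it assumes the class name is simply the file's base name without its extension. The real implementation may normalize the name further.

```python
from pathlib import Path


def class_name_from_path(path: str) -> str:
    """Hypothetical helper: use the file's base name, minus extension, as the class name."""
    return Path(path).stem


# Any directory part and the extension are dropped; only the stem remains.
for f in ["filea.tsv", "data/fileb.tsv"]:
    print(f, "->", class_name_from_path(f))
```

Under this assumption, the only way to influence the class name is to change the name of the file that is passed in.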
I decided to write a small function as a replacement for `CSVDataGeneralizer.convert_multiple` that allows me to set class names explicitly (and also to set some metadata such as `id`, `name` and `description`, which are not configurable themselves).
This works fine, except that I'm not able to infer foreign keys using this method, because `CSVDataGeneralizer.infer_linkages` uses the same method for deriving class names: schema-automator/schema_automator/generalizers/csv_data_generalizer.py, lines 130 to 131 in 99aff03.
Should it be possible to explicitly specify class names here? I'm not sure what a CLI flag that allows this would look like. Maybe something like `schemauto generalize-tsvs --class ClassA=filea.tsv --class ClassB=fileb.tsv --class ClassC=filec.tsv`.
For the time being, I can run the builtin `convert_multiple` function and then replace any values in the resulting schema in code. (EDIT: That was a poor idea. The better workaround is just to create a soft link to the file, where the link's file name is the desired class name.)
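As a rough sketch of the soft-link workaround, the snippet below takes `ClassName=path` pairs (the syntax floated above for a possible `--class` flag, which does not exist in the CLI today) and creates one symlink per file, named after the desired class. Both helpers, `parse_class_specs` and `link_with_class_names`, are hypothetical names invented for this example and are not part of Schema Automator.

```python
import os
from pathlib import Path


def parse_class_specs(specs):
    """Parse hypothetical 'ClassName=path' strings into a {class_name: Path} dict."""
    mapping = {}
    for spec in specs:
        name, sep, path = spec.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected ClassName=path, got {spec!r}")
        if name in mapping:
            raise ValueError(f"duplicate class name {name!r}")
        mapping[name] = Path(path)
    return mapping


def link_with_class_names(mapping, link_dir):
    """Create one symlink per file, named after the desired class; return the link paths."""
    links = []
    for name, target in mapping.items():
        # Keep the original extension so the link still looks like a TSV file.
        link = Path(link_dir) / f"{name}{target.suffix}"
        os.symlink(target.resolve(), link)
        links.append(link)
    return links
```

The returned link paths can then be passed to `schemauto generalize-tsvs` in place of the original files, so that the filename-based derivation yields the desired class names.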