Dataset | Model | $t_1$ | $t_2$ | $t_3$ | $t_4$ | $t_5$ | $t_6$ | $t_7$ | $t_8$ |
---|---|---|---|---|---|---|---|---|---|
WN18RR | BERT-Large | 2.196 | 9.366 | 9.186 | 19.419 | 4.720 | 19.345 | 9.936 | 27.856 |
WN18RR | PubMedBERT | - | - | - | - | - | - | - | - |
WN18RR | BART-Large | 0.010 | 0.285 | 0.221 | 2.164 | 0.010 | 0.031 | 0.0 | 0.190 |
WN18RR | Flan-T5-Large | 0.179 | 19.704 | 5.543 | 31.267 | 0.0 | 3.030 | 5.702 | 26.800 |
WN18RR | BLOOM-1b7 | 66.832 | 71.531 | 79.208 | 76.842 | 40.084 | 61.964 | 68.394 | 70.031 |
WN18RR | Flan-T5-XL | 2.819 | 40.263 | 17.835 | 52.217 | 0.010 | 7.750 | 18.479 | 18.859 |
WN18RR | BLOOM-3b | 63.336 | 75.290 | 79.081 | 77.064 | 37.402 | 65.322 | 68.996 | 71.626 |
WN18RR | LLaMA-7B | 37.265 | 74.614 | 70.168 | 75.976 | 24.287 | 76.610 | 67.412 | 81.383 |
WN18RR | GPT-3 | 15.322 | 26.557 | 37.866 | 27.571 | 8.479 | 27.138 | 27.518 | 24.656 |
WN18RR | GPT-3.5 | 24.276 | 80.813 | 89.461 | 91.721 | 0.813 | 60.760 | 49.387 | 82.418 |
WN18RR | GPT-4 | - | - | - | 90.116 | - | - | - | - |
WN18RR | Flan-T5-Large* | 73.326 | 76.747 | 54.572 | 76.906 | 10.834 | 61.362 | 54.297 | 69.324 |
WN18RR | Flan-T5-XL* | 84.519 | 84.772 | 77.275 | 86.282 | 50.232 | 76.462 | 72.386 | 80.517 |
GeoNames | BERT-Large | 38.345 | 29.798 | 30.867 | 35.328 | 23.610 | 25.666 | 11.320 | 30.447 |
GeoNames | PubMedBERT | - | - | - | - | - | - | - | - |
GeoNames | BART-Large | 8.474 | 0.572 | 2.237 | 0.984 | 21.480 | 20.513 | 7.830 | 23.218 |
GeoNames | Flan-T5-Large | 11.552 | 3.579 | 13.165 | 4.685 | 9.451 | 6.057 | 8.178 | 7.387 |
GeoNames | BLOOM-1b7 | 2.716 | 2.549 | 2.895 | 3.206 | 28.517 | 18.382 | 25.866 | 19.809 |
GeoNames | Flan-T5-XL | 33.818 | 15.719 | 19.771 | 20.789 | 15.365 | 12.413 | 18.439 | 15.823 |
GeoNames | BLOOM-3b | 3.765 | 4.702 | 2.646 | 3.438 | 28.844 | 18.084 | 25.646 | 20.718 |
GeoNames | LLaMA-7B | 29.495 | 14.169 | 25.547 | 15.954 | 13.919 | 9.446 | 17.794 | 16.795 |
GeoNames | GPT-3 | 22.426 | 8.727 | - | 7.500 | - | - | - | - |
GeoNames | GPT-3.5 | 35.000 | - | - | - | - | - | - | - |
GeoNames | GPT-4 | 43.286 | - | - | - | - | - | - | - |
GeoNames | Flan-T5-Large* | 15.089 | 15.174 | 14.937 | 15.128 | 15.774 | 16.282 | 15.934 | 16.918 |
GeoNames | Flan-T5-XL* | 18.358 | 18.120 | 18.120 | 17.912 | 17.265 | 17.328 | 17.450 | 17.641 |
NCI | BERT-Large | 9.948 | 9.765 | 2.611 | 2.902 | 11.095 | 10.966 | 1.127 | 1.364 |
NCI | PubMedBERT | 5.876 | 5.369 | 4.520 | 2.790 | 3.368 | 1.613 | 1.339 | 0.657 |
NCI | BART-Large | 7.095 | 7.876 | 5.144 | 6.325 | 9.103 | 9.943 | 7.240 | 8.267 |
NCI | Flan-T5-Large | 4.591 | 5.069 | 7.531 | 8.966 | 3.069 | 4.250 | 5.485 | 5.843 |
NCI | BLOOM-1b7 | 12.031 | 12.106 | 11.220 | 12.435 | 10.954 | 10.451 | 11.133 | 11.499 |
NCI | Flan-T5-XL | 4.441 | 5.656 | 7.419 | 9.831 | 2.125 | 3.297 | 3.871 | 6.284 |
NCI | BLOOM-3b | 13.770 | 14.352 | 12.946 | 14.418 | 14.264 | 14.069 | 14.926 | 15.562 |
NCI | LLaMA-7B | 3.788 | 4.054 | 3.248 | 4.778 | 3.672 | 3.921 | 5.252 | 7.714 |
NCI | GPT-3 | 9.303 | 9.174 | 11.033 | 12.742 | 9.374 | 8.754 | 9.141 | 9.112 |
NCI | GPT-3.5 | 11.045 | 9.523 | 14.709 | 14.227 | 8.563 | 8.130 | 12.684 | 11.249 |
NCI | GPT-4 | - | - | 16.057 | - | - | - | - | - |
NCI | Flan-T5-Large* | 30.605 | 31.594 | 31.328 | 31.927 | 29.116 | 29.282 | 31.291 | 30.796 |
NCI | Flan-T5-XL* | 31.511 | 30.996 | 32.784 | 32.052 | 30.014 | 29.706 | 31.769 | 31.357 |
SNOMEDCT_US | BERT-Large | 19.839 | 8.022 | 1.066 | 0.125 | 21.109 | 12.766 | 0.458 | 0.048 |
SNOMEDCT_US | PubMedBERT | 28.488 | 22.477 | 13.910 | 5.703 | 7.964 | 3.586 | 2.299 | 1.513 |
SNOMEDCT_US | BART-Large | 19.164 | 19.810 | 4.168 | 4.046 | 17.541 | 17.890 | 10.061 | 9.434 |
SNOMEDCT_US | Flan-T5-Large | 19.264 | 19.898 | 21.040 | 24.322 | 8.078 | 8.901 | 11.541 | 12.924 |
SNOMEDCT_US | BLOOM-1b7 | 32.433 | 37.023 | 13.782 | 19.978 | 29.486 | 30.400 | 31.249 | 33.864 |
SNOMEDCT_US | Flan-T5-XL | 25.213 | 26.230 | 30.095 | 31.650 | 7.219 | 8.221 | 15.586 | 17.221 |
SNOMEDCT_US | BLOOM-3b | 34.264 | 37.694 | 27.180 | 27.878 | 31.061 | 32.213 | 33.298 | 35.474 |
SNOMEDCT_US | LLaMA-7B | 7.560 | 6.754 | 7.890 | 8.063 | 10.748 | 10.808 | 13.154 | 13.818 |
SNOMEDCT_US | GPT-3 | 21.066 | 20.333 | 22.730 | 24.365 | 19.207 | 18.995 | 20.208 | 20.097 |
SNOMEDCT_US | GPT-3.5 | 21.812 | 17.991 | 25.026 | 24.506 | 18.245 | 15.711 | 22.718 | 19.873 |
SNOMEDCT_US | GPT-4 | - | - | 22.368 | 27.835 | - | - | - | - |
SNOMEDCT_US | Flan-T5-Large* | 32.272 | 31.992 | 31.561 | 31.366 | 32.005 | 31.508 | 33.393 | 33.058 |
SNOMEDCT_US | Flan-T5-XL* | 43.395 | 42.038 | 42.766 | 41.750 | 40.898 | 40.316 | 42.605 | 42.482 |
Medcin | BERT-Large | 7.332 | 1.250 | 0.141 | 0.059 | 8.712 | 1.198 | 0.088 | 0.012 |
Medcin | PubMedBERT | 15.628 | 9.718 | 5.209 | 1.586 | 5.688 | 2.320 | 1.272 | 0.613 |
Medcin | BART-Large | 11.679 | 12.655 | 2.271 | 2.317 | 9.403 | 9.227 | 5.470 | 4.825 |
Medcin | Flan-T5-Large | 9.302 | 8.082 | 10.974 | 12.963 | 2.892 | 3.590 | 6.712 | 6.781 |
Medcin | BLOOM-1b7 | 27.583 | 28.673 | 2.702 | 4.975 | 26.389 | 28.760 | 26.897 | 26.694 |
Medcin | Flan-T5-XL | 15.248 | 15.895 | 18.041 | 18.519 | 4.473 | 5.446 | 11.143 | 11.096 |
Medcin | BLOOM-3b | 23.055 | 28.310 | 14.395 | 10.826 | 22.585 | 24.238 | 27.303 | 29.811 |
Medcin | LLaMA-7B | 3.406 | 2.806 | 3.370 | 3.730 | 4.904 | 4.473 | 3.175 | 3.804 |
Medcin | GPT-3 | 22.406 | 22.560 | 25.726 | 24.915 | 19.759 | 17.808 | 19.921 | 18.572 |
Medcin | GPT-3.5 | 22.511 | 22.066 | 23.929 | 23.588 | 20.465 | 19.848 | 22.372 | 20.231 |
Medcin | GPT-4 | - | - | 21.255 | 23.613 | - | - | - | - |
Medcin | Flan-T5-Large* | 38.375 | 36.377 | 37.436 | 35.868 | 31.262 | 30.009 | 33.112 | 31.911 |
Medcin | Flan-T5-XL* | 51.809 | 50.900 | 51.800 | 51.160 | 47.880 | 45.388 | 49.865 | 49.094 |

Models marked with an asterisk (Flan-T5-Large* and Flan-T5-XL*) report few-shot learning results on each dataset.