cold start handling in ranked batch sampling #28

@zhangyu94

Hi!

The cold start handling in ranked batch sampling seems to differ from what Cardoso et al. describe in "Ranked batch-mode active learning".

modAL/modAL/batch.py

Lines 133 to 139 in 452898f

if classifier.X_training is None:
    labeled = select_cold_start_instance(X=unlabeled, metric=metric, n_jobs=n_jobs)
elif classifier.X_training.shape[0] > 0:
    labeled = classifier.X_training[:]
# Define our record container and the maximum number of records to sample.
instance_index_ranking = []

In modAL's implementation, in the cold start case, the instance selected by select_cold_start_instance is not added to the ranking list instance_index_ranking.
In "Ranked batch-mode active learning", by contrast, the instance selected at cold start seems to be the first item in the ranking. The current function returns only the instance itself, not its index:

return X[best_coldstart_instance_index].reshape(1, -1)

If my understanding of the algorithm proposed in the paper and of modAL's implementation is correct, we can change the return statement of select_cold_start_instance to
return best_coldstart_instance_index, X[best_coldstart_instance_index].reshape(1, -1),
store best_coldstart_instance_index in instance_index_ranking, and revise ranked_batch accordingly.
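
To make the proposal concrete, here is a minimal sketch of the idea, not a patch against modAL itself. It assumes the cold start instance is the one with the smallest average distance to all other unlabeled instances, and it computes Euclidean distances with plain NumPy rather than with whatever distance routine the library uses. The function name select_cold_start_instance_with_index and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np


def select_cold_start_instance_with_index(X):
    """Hypothetical variant that returns (index, instance) instead of the instance only.

    Assumption: the cold start instance is the row of X with the smallest
    average Euclidean distance to every row of X.
    """
    # Pairwise Euclidean distances via broadcasting: shape (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    average_distances = distances.mean(axis=0)
    best_coldstart_instance_index = int(np.argmin(average_distances))
    return best_coldstart_instance_index, X[best_coldstart_instance_index].reshape(1, -1)


# Toy unlabeled pool (illustrative data only).
unlabeled = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [10.0, 0.0]])

# Cold start: record the selected index as the first entry of the ranking,
# so the instance is part of the returned batch rather than silently dropped.
instance_index_ranking = []
best_index, labeled = select_cold_start_instance_with_index(unlabeled)
instance_index_ranking.append(best_index)
```

With this shape of return value, the caller (ranked_batch in the issue's terms) would start the ranking from the cold start index and then continue selecting the remaining instances of the batch as before.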
